Abstract

We develop rare-event simulation methodology for the analysis of loss events in a many-server
loss system under a quality-driven regime, focusing on the steady-state loss probability
(i.e. the fraction of lost customers over arrivals) and the behavior of the whole system leading to
loss events. The analysis of these events requires working with the full measure-valued process
describing the system. This is the first algorithm that is shown to be asymptotically optimal,
in the rare-event simulation context, under the setting of many-server queues involving a full
measure-valued descriptor.

While there is a vast literature on rare-event simulation algorithms for queues with a fixed number
of servers, few algorithms exist for queueing systems with many servers. In systems with a single
server or a fixed number of servers, random walk representations are often used to analyze associated
rare events (see for example Siegmund (1976), Asmussen (1985), Anantharam (1988), Sadowsky
(1991) and Heidelberger (1995)). The difficulty in these types of systems arises from the boundary
behavior induced by the positivity constraints inherent to queueing systems. Many-server systems
are, in some sense, less sensitive to boundary behavior (as we shall demonstrate in the basic
development of our ideas) but instead the challenge in their rare-event analysis lies in the fact that
the system description is typically infinite dimensional (measure-valued). One of the goals of this
paper, broadly speaking, is to propose methodology and techniques that we believe are applicable
to a wide range of rare-event problems involving many-server systems. In particular, we will
demonstrate how the measure-valued description is both necessary and useful for efficient simulation.
This arises primarily from the intimate relation between the steady-state large deviations behavior
and the measure-valued diffusion approximation of many-server systems. As far as we know,
the algorithm proposed in this paper is the first provably asymptotically optimal algorithm (in a
sense that we will explain shortly) that involves such a measure-valued descriptor in the rare-event
simulation literature.

In order to illustrate our ideas we focus on the problem of estimating the steady-state loss 
probability in many-server loss systems. We consider a system with general i.i.d. interarrival times 
and service times (both under suitable tail conditions). The system has s servers and no waiting 
room. If a customer arrives and finds a server empty, he immediately starts service occupying a 
server. If the customer finds all the servers busy, he leaves the system immediately and the system 
incurs a "loss". The steady-state loss probability (i.e. the long term proportion of customers that
are lost) is rare if the traffic intensity (arrival rate into the system / total service rate) is less 
than one and the number of servers is large. This is precisely the asymptotic environment that we 
consider. 

Related large deviations and simulation results include the work of Glynn (1995), who developed 
large deviations asymptotics for the number-in-system of an infinite-server queue with high arrival 
rates. Based on this result, Szechtman and Glynn (2002) developed a corresponding rare-event 
algorithm for the same quantity of an infinite-server queue, using a sequential tilting scheme that
mimics the optimal exponential change of measure. Related results for first passage time probabili- 
ties have also been obtained by Ridder (2009) in the setting of Markovian queues. Blanchet, Glynn 
and Lam (2009) constructed an algorithm for the steady-state loss probability of a slotted-time 
M/G/s system with bounded service time. The algorithm in Blanchet, Glynn and Lam (2009) is 
the closest in spirit to our methodology here, but the slotted-time nature, the Markovian structure 
and the fact that the service times were bounded were used in a crucial way to avoid the main 
technical complications involved in dealing with measure-valued descriptors. 

In this paper we focus on the steady-state loss estimation of a fully continuous GI/G/s system 
with service times that accommodate most distributions used in practice, including mixtures of 
exponentials, Weibull and lognormal distributions. A key element of our algorithm, in addition 
to the use of measure-valued process, is the application of weak convergence limits by Krichagina 
and Puhalskii (1997) and Pang and Whitt (2009). As we shall see, the weak convergence results 
are necessary because via a suitable extension of regenerative-type simulation (see Section 2) the 
steady-state loss probability of the system can be transformed to a first passage problem of the 
measure-valued process starting from an appropriate set, suitably chosen by means of such weak 
convergence analysis. However, unlike the infinite-server system, the capacity constraint (s servers)
introduces a boundary that forces us to work with the sample path and to track the whole process
history. We will also see that the properties (and especially the "decay" behavior) of the
steady-state measure play an important role in controlling the efficiency of the algorithm in the case of
unbounded service time. In fact, new logarithmic asymptotic results of steady-state convergence
(in the sense described in Section 4) are derived along our way to prove algorithmic efficiency. 

Our main methodology to construct an efficient algorithm is based on importance sampling,
which is a variance reduction technique that biases the probability measure of the system (via a
so-called change of measure) to enhance the occurrence of the rare event. In order to correct for the
bias, the sample output is multiplied by a likelihood ratio to maintain unbiasedness. The key to
efficiency is then to control the likelihood ratio, which is typically small, and hence favorable, when
the change of measure resembles the conditional distribution given the occurrence of the rare event.
Construction of good changes of measure often draws on associated large deviations theory (see 
Asmussen and Glynn (2007), Chapter 6). We will carry out this scheme of ideas in subsequent 
sections. 

The criterion of efficiency that we will be using is the so-called asymptotic optimality (or
logarithmic efficiency). More concretely, suppose we want to estimate some probability $\alpha := \alpha(s)$
that goes to 0 as $s \nearrow \infty$. For any unbiased estimator $X$ of $\alpha$ (i.e. $\alpha = EX$) one must have
$EX^2 \geq (EX)^2 = \alpha^2$ by Jensen's inequality. Asymptotic optimality requires that $\alpha^2$ is also an
upper bound of the estimator's second moment (and hence variance) in terms of exponential decay
rate. In other words,

$$\liminf_{s \to \infty} \frac{\log EX^2}{\log \alpha^2} = 1.$$

This implies that the estimator $X$ possesses the optimal exponential decay rate any unbiased
estimator can possibly achieve. See, for example, Bucklew (2004), Asmussen and Glynn (2007) and
Juneja and Shahabuddin (2006) for further details on asymptotic optimality.
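
To make the efficiency criterion concrete, the following is a minimal sketch on a toy problem that is not the queueing model of this paper: estimating $\alpha = P(Z > c)$ for a standard normal $Z$ by sampling from the mean-shifted law $N(c, 1)$ and reweighting by the likelihood ratio. The function names and the choice of a Gaussian tail are illustrative assumptions; the point is only to show how the ratio $\log EX^2 / \log \alpha^2$ can be inspected empirically.

```python
import math
import random

def tilted_estimator(c, n, seed=0):
    """Estimate alpha = P(Z > c), Z ~ N(0,1), by sampling Y ~ N(c,1).

    Returns (estimate of alpha, estimate of the second moment E[X^2]),
    where X = 1{Y > c} * dP/dQ(Y) is the importance sampling estimator.
    """
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(n):
        y = rng.gauss(c, 1.0)                    # sample under the tilted measure
        if y > c:
            # Likelihood ratio of N(0,1) against N(c,1) evaluated at y.
            lr = math.exp(-c * y + 0.5 * c * c)
            total += lr
            total_sq += lr * lr
    return total / n, total_sq / n

def log_efficiency_ratio(c, n=100_000, seed=0):
    """Empirical log E[X^2] / log alpha^2; values near 1 indicate efficiency."""
    alpha_hat, second_moment = tilted_estimator(c, n, seed)
    return math.log(second_moment) / math.log(alpha_hat ** 2)

if __name__ == "__main__":
    c = 4.0
    alpha_hat, _ = tilted_estimator(c, 100_000)
    exact = 0.5 * math.erfc(c / math.sqrt(2.0))  # exact Gaussian tail
    print(alpha_hat, exact, log_efficiency_ratio(c))
```

In this toy setting the ratio never exceeds 1 (by Jensen's inequality applied to the sample) and approaches 1 as $c$ grows, which is the behavior the criterion asks for as $s \nearrow \infty$.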

Finally, we emphasize the potential applications of loss estimation in many-server systems. One 
prominent example is call center analysis. Customer support centers, intra-company phone systems 
and emergency rooms, among others, typically have fixed system capacity above which calls would 
be lost. In many situations losses are rare, yet their implications can be significant. The most 
extreme example is perhaps a 911 center in which any call loss can be life-threatening. In view
of this, an accurate estimate (at least to the order of magnitude) of loss probability is often an
indispensable indicator of system performance. While in this paper we focus on i.i.d. interarrival 
and service times, under mild modifications, our methodology can be adapted to different model 
assumptions such as Markov-modulation and time inhomogeneity that arise naturally in certain 
application environments. As an aside, a rather surprising and novel application of the present
methodology is in the context of actuarial loss in insurance and pension funds. In such systems 
the policyholders (insurance contract or pension scheme buyers) are the "customers", and "loss" 
is triggered not by an exceedence of the number of customers but rather by a cash overflow of the 
insurer. Under suitable model assumptions, the latter can be expressed as a functional of the past 
system history whereby the measure-valued descriptor becomes valuable. The full development of 
this application is presented in Blanchet and Lam (2011). 

The organization of the paper is as follows. In Section 1 we will indicate our main results 
and lay out our GI/G/s model assumptions. In Section 2 we will explain and describe in detail 
our simulation methodology. Section 3 will focus on the proof of algorithmic efficiency and large 
deviations asymptotics, while Section 4 will be devoted to the use of weak convergence results 
mentioned earlier for the design of an appropriate recurrent set. Finally, we will provide numerical 
results in Section 5, and technical details are left to the appendix. 



1 Main Results and Contributions 

1.1 Problem Formulation and Main Results 

In this subsection we describe our problem formulation, and discuss our main results. At a general
level, our main contribution in this paper is the development of methodology for efficient
rare-event analysis of the steady-state behavior of many-server systems in a quality-driven regime. Our
methodology, however, is also suitable for transient rare-event analysis assuming the initial condition of
the system is within the diffusion scale from the fluid limit of the system.

The main idea of our methodology is to first introduce a coupling with the infinite server queue. 
Second, take advantage of a suitable ratio representation for the associated probability of interest 
for the system in consideration (in our case a loss system). Third, identify a suitable regenerative- 
like set based on available results in the literature on diffusion approximations for the system in 
consideration. Finally, identify a rare-event of interest inside a cycle that is common to both the 
system in consideration and the infinite-server system, and that has the same asymptotics as the 
probability of interest. It is crucial for the last step to select the regenerative-like set carefully. We 
concentrate on loss probabilities in this paper, but an almost identical (asymptotically optimal) 
algorithm can be obtained for the steady-state probability of delay in a many-server queue under 
the quality driven regime (when the traffic intensity is bounded away from 1 as the number of 
servers and the arrival rate grow to infinity at the same rate). 

Throughout the rest of the paper we concentrate on loss systems and develop the four elements 
outlined in the previous paragraph for the evaluation of steady-state loss probabilities, which are 
defined as 

$$P(\text{loss}) = \lim_{T \to \infty} \frac{\text{number of losses up to } T}{\text{number of arrivals up to } T}. \tag{1}$$

Kac's formula (see Breiman (1968)) allows us to express the loss probability as

$$P(\text{loss}) = \frac{E_A N_A}{\lambda s E_A \tau_A} \tag{2}$$

where $A$ is a set that is visited by the chain infinitely often. The expectation $E_A[\cdot]$ denotes the
expectation with initial state distributed according to the steady-state distribution conditioned on
being in $A$. The quantity $N_A$ is the number of losses before returning to the set $A$, and $\tau_A$ is the time
back to $A$. Moreover, $\lambda s$ is the arrival rate (which is assumed to scale linearly with the number of
servers $s$; the full discussion of our scaling assumptions will be laid out in the next subsection). For
now, let us mention that both $E_A N_A$ and $E_A \tau_A$ are also dependent on the parameter $s$ because of
the scaling.

Note that (1) cannot be directly simulated, but formula (2) provides a basis for regenerative-type
simulation (see Asmussen and Glynn (2007), Chapter 4). After identifying a recurrent set $A$, a
straightforward crude Monte Carlo strategy would be to run the system for a long time from some
initial state, take a record of $N_A$ and $\tau_A$ every time it hits $A$, and output the sample means of $N_A$
and $\tau_A$. This strategy is valid as long as the running time is long enough to allow for the system
to be close to stationarity. Moreover, this strategy is basically the same as merely outputting the
number of loss events divided by the run time times $\lambda s$ (excluding the uncompleted last $A$-cycle).
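
The crude regenerative strategy just described can be sketched on a deliberately simple stand-in: a Markovian M/M/s/s loss queue, for which the exact answer is given by the Erlang B formula. This is an illustrative assumption, not the GI/G/s setting of the paper; the recurrent "set" is a single hypothetical state `start`, and the denominator counts arrivals per cycle directly, which plays the role of $\lambda s E_A \tau_A$.

```python
import random

def erlang_b(load, servers):
    """Exact Erlang B blocking probability via the standard recursion."""
    b = 1.0
    for k in range(1, servers + 1):
        b = load * b / (k + load * b)
    return b

def regenerative_loss_estimate(lam, mu, servers, n_cycles, start, seed=0):
    """Crude regenerative estimate of the loss probability of an M/M/s/s queue.

    A cycle ends each time the embedded jump chain returns to the
    (hypothetical) recurrent state `start`. The estimate is
    (total losses over all cycles) / (total arrivals over all cycles).
    """
    rng = random.Random(seed)
    q = start
    losses = arrivals = cycles = 0
    while cycles < n_cycles:
        arrival_rate, departure_rate = lam, mu * q
        if rng.random() < arrival_rate / (arrival_rate + departure_rate):
            arrivals += 1
            if q == servers:
                losses += 1          # all servers busy: the customer is lost
            else:
                q += 1
        else:
            q -= 1                   # a service completion
        if q == start:
            cycles += 1              # the chain is back in the recurrent set
    return losses / arrivals

if __name__ == "__main__":
    print(regenerative_loss_estimate(3.0, 1.0, 5, 50_000, start=3),
          erlang_b(3.0, 5))
```

With few servers the loss probability is not rare and the crude estimate is accurate; as the number of servers grows with the load held below capacity, almost every cycle reports zero losses, which is the inefficiency that motivates importance sampling.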

However, recognizing that loss is a rare event (with exponential decay rate in $s$ as we will show
as a by-product of our analysis), this method will take an exponential amount of time in $s$ to get
a specified relative error. This is regardless of the choice of $A$: if $A$ is large, it takes a short time
to regenerate, i.e. $\tau_A$ is small, and consequently the number of losses reported as the numerator
$E_A N_A$ of (2) is almost always zero; whereas if $A$ is small, it takes a long time to regenerate. In
order to dramatically speed up the computation time, our strategy is the following. We choose $A$
to be a "central limit" set so that $E_A \tau_A$ is not exponentially large in $s$ (and not exponentially small
either; see Section 2.1). This isolates the rarity of loss to the numerator $E_A N_A$. In other words, it
is very difficult for the process to reach overflow in an $A$-cycle. The key, then, is to construct an
efficient importance sampling scheme to induce overflow and to estimate the number of losses in
each $A$-cycle.

We point out two practical observations using this approach. First, $\tau_A$ and $N_A$ can be estimated
separately, i.e. one can "split" the process every time it hits $A$: to one copy we apply importance
sampling to get one sample of $N_A$, after which the copy is discarded; to the other copy we apply the
original measure to get one sample of $\tau_A$ and also to set the initial position for the next $A$-cycle (see
Asmussen and Glynn (2007), Chapter 4). Secondly, to get an estimate of standard deviation one has to use
batch estimates since the samples obtained this way possess serial correlations (Asmussen and Glynn
(2007), Chapter 4). In other words, one has to divide the simulated chain into several segments
of equal number of time units. Then an estimate of the steady-state loss probability is computed
from each chain segment. These estimates are regarded as independent samples of loss probability.
The details of batch sampling will be provided in Section 5 when we discuss numerical results.

We summarize our approach as follows: 

Algorithm 1 

1. Choose a recurrent set $A$. Initialize the GI/G/s queue's status as any point in $A$.

2. Run the queue. Each time the queue hits a point in $A$, say $x$, do the following: Starting from $x$,

(a) Use importance sampling to sample one $N_A$, the number of losses in a cycle.

(b) Use crude Monte Carlo to sample one $\tau_A$, the return time. The final position of this
queue is taken as the new $x$.

3. Divide the queue into several segments of equal time length. Compute the estimate of the
steady-state loss probability using the batch samples.



The main result of this paper is the construction of an efficient importance sampling scheme
together with a proof of its asymptotic optimality. In order to show the optimality of the algorithm,
we obtain along the way large deviations asymptotics for loss probabilities that might be of independent
interest.



Theorem 1. The estimator using the recurrent set $A$ in (10) and the importance sampler given by
Algorithm 2 is asymptotically optimal. Moreover, the steady-state loss probability decays
exponentially in $s$ with decay rate $I^*$ defined in (19).



An important novel feature of the problem we consider (and our solution) is that it requires a
construction based on full measure-valued processes. Intuitively, the steady-state loss probability of
the GI/G/s system depends on its loss behavior starting from a "normal" or "typical" state under
stationarity (which comes from a diffusion limit). It turns out that the loss behavior can vary
substantially if one defines this initial "normal" state only through the system's queue length (even
though the loss event is defined only through the queue length). However, by defining the "normal"
state through the whole description of the system (which requires a measure) the loss behavior
starting from this measure-valued state is characterized by a natural optimal path in the large
deviations sense, and as a result we can identify the efficient importance sampling scheme to induce
such losses. These observations ultimately translate to the need for a measure-valued recurrent set
$A$ in the simulation of $E_A N_A$ in (2).

We next point out two further methodological observations. First, our importance sampling
algorithm utilizes the representation of a (coupled) GI/G/$\infty$ queue as a point process. This point
process representation, we believe, can also be used to prove results on sample path large deviations
for many-server systems; such development will be reported in Blanchet, Chen and Lam (2012).
Secondly, our algorithm requires essentially the information of the whole sample path of the system
due to a randomization of the time horizon, in contrast to the algorithm proposed in Szechtman and
Glynn (2002) for estimating fixed-time probabilities.



Finally, the recurrent set $A$, given by (10), can be seen to possess the following properties:

Proposition 1. In the GI/G/s system,

$$\lim_{s \to \infty} \frac{1}{s} \log E_A \tau_A^p = 0 \tag{3}$$

and

$$\limsup_{s \to \infty} \frac{1}{s} \log E_A N_A^p \leq 0 \tag{4}$$

for any $p > 0$.



Briefly stated, Proposition 1 stipulates that any moments of the time length and number of
losses of an $A$-cycle are subexponential in $s$. When $p = 1$, it in particular states that the expected
time length of a cycle is subexponential in $s$. As discussed above, this isolates the rarity of loss to
the numerator in (2) and ensures the validity of Algorithm 1. The result on general $p$ in Proposition
1 is also used in the optimality proof of the importance sampling (as will be seen in Section 3).
Interestingly, the proof of Proposition 1 requires the use of the Borell-TIS inequality for Gaussian
random fields. The connection to Gaussian random fields arises in the diffusion limit of the coupled
GI/G/$\infty$ queue.
1.2 Assumptions on Arrivals and Service Time Distribution

We now state the assumptions of our model, namely a GI/G/s loss system. There are $s \geq 1$
servers in the system. We assume arrivals follow a renewal process with rate $\lambda s$, i.e. the interarrival
times are i.i.d. with mean $1/(\lambda s)$. More precisely, we introduce a "base" arrival system, with
$N^0(t), t \geq 0$ as its counting process of the arrivals from time 0 to $t$, and $U_k^0, k = 0, 1, 2, \ldots$ as the
i.i.d. interarrival times with $EU_k^0 = 1/\lambda$ (except the first arrival $U_0^0$, which can be delayed). We
then scale the system so that $N_s(t) = N^0(st)$ is the counting process of the $s$-th order system,
and $U_k = U_k^0/s, k = 0, 1, 2, \ldots$ are the interarrival times. Moreover, we let $A_k, k = 1, 2, \ldots$ be the
arrival times, i.e. $A_k = \sum_{i=0}^{k-1} U_i$ (note the convention $U_k = A_{k+1} - A_k$ and $A_0 = 0$). Note that for
convenience we have suppressed the dependence on $s$ in $U_k$ and $A_k$.

We assume that $U_k$ has exponential moments in a neighborhood of the origin, and let
$\kappa_s(\theta) = \log E e^{\theta U_k}$ be the logarithmic moment generating function of $U_k$. It is easy to see that
$\kappa_s(\theta) = \kappa^0(\theta/s)$ where $\kappa^0(\theta) = \log E e^{\theta U_k^0}$ is the logarithmic moment generating function of the
interarrival time in the base system.

Since $\kappa^0(\cdot)$ is increasing, we can let

$$\psi_N(\theta) = -(\kappa^0)^{-1}(-\theta) \tag{5}$$

where $(\kappa^0)^{-1}(\cdot)$ is the inverse of $\kappa^0(\cdot)$. Note that $\kappa_s^{-1}(\theta) = s(\kappa^0)^{-1}(\theta)$. Also, $\psi_N(\cdot)$ is increasing
and convex; this is inherited from $\kappa^0(\cdot)$.

Now we impose a few assumptions on $\psi_N(\cdot)$. First, we assume $\text{Dom}\,\psi_N \supset \mathbb{R}_+$ (that $\text{Dom}\,\psi_N \supset \mathbb{R}_-$
is obvious from the definition of $\psi_N(\cdot)$), and hence $\text{Dom}\,\psi_N = \mathbb{R}$. We also assume that $\psi_N(\cdot)$ is
twice continuously differentiable on $\mathbb{R}$, strictly convex and steep on the positive side, i.e. $\psi_N'(\theta) \nearrow \infty$
as $\theta \nearrow \infty$. Thus $\psi_N'(0) = \lambda$ and $\psi_N'(\mathbb{R}_+) = [\lambda, \infty)$. Finally, we impose the technical condition

$$\theta \frac{d}{d\theta} \log \psi_N(\theta) \to \infty \tag{6}$$

as $\theta \nearrow \infty$. This condition is satisfied by many common interarrival distributions, such as
exponential, Gamma, Erlang etc. (Its use is in Lemma 4 as a regularity condition to prevent the blow-up
of the likelihood ratio due to sample paths that hit overflow very early.)

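Under these assumptions we have, for any $0 = t_0 < t_1 < \cdots < t_m < \infty$ and $\theta_1, \ldots, \theta_m \in \text{Dom}\,\psi_N$,

$$\frac{1}{s} \log E \exp\left\{ \sum_{i=1}^m \theta_i \left( N_s(t_i) - N_s(t_{i-1}) \right) \right\} \to \sum_{i=1}^m \psi_N(\theta_i)(t_i - t_{i-1}) \tag{7}$$

as $s \nearrow \infty$. In particular, $\psi_N(\cdot)t$ is the so-called Gärtner-Ellis limit of $N_s(t)$ for any $t > 0$ as $s \nearrow \infty$.
See Glynn and Whitt (1991) and Glynn (1995). In the case of Poisson arrivals, for example, the
interarrival times are exponential and we have $\kappa^0(\theta) = \log(\lambda/(\lambda - \theta))$. This gives $\psi_N(\theta) = \lambda(e^\theta - 1)$
and $\text{Dom}\,\psi_N = \mathbb{R}$.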
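
As a sanity check of definition (5) in the Poisson case, the sketch below inverts $\kappa^0$ numerically by bisection and compares the result with the closed form $\lambda(e^\theta - 1)$. The function names and the bisection routine are illustrative choices, not part of the paper's algorithm.

```python
import math

def kappa_exponential(theta, lam):
    """Log-MGF of an Exp(lam) interarrival time, finite for theta < lam."""
    return math.log(lam / (lam - theta))

def psi_n(theta, lam, tol=1e-12):
    """psi_N(theta) = -kappa^{-1}(-theta), with kappa inverted by bisection."""
    target = -theta
    # kappa is increasing on (-inf, lam); bracket the root of kappa(x) = target.
    lo, hi = -1.0, lam - 1e-12
    while kappa_exponential(lo, lam) > target:
        lo *= 2.0
    while hi - lo > tol * max(1.0, abs(lo)):
        mid = 0.5 * (lo + hi)
        if kappa_exponential(mid, lam) < target:
            lo = mid
        else:
            hi = mid
    return -0.5 * (lo + hi)

if __name__ == "__main__":
    lam = 2.0
    for theta in (-1.0, 0.5, 1.5):
        print(theta, psi_n(theta, lam), lam * (math.exp(theta) - 1.0))
```

The same numerical inversion could be applied to other interarrival laws whose log-MGF is available, for which no closed form of $\psi_N$ may exist.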

We now state our assumptions on the service times. Denote $V_k$ as the service time of the
$k$-th arriving customer, and let $V_k, k = 1, 2, \ldots$ be i.i.d. with distribution function $F(\cdot)$ and tail
distribution function $\bar{F}(\cdot)$. We assume that $F(\cdot)$ has a density $f(\cdot)$ that satisfies

$$\lim_{y \to \infty} y h(y) = \infty \tag{8}$$

where $h(y) = f(y)/\bar{F}(y)$ is the hazard rate function (with the convention that $h(y) = \infty$ whenever
$\bar{F}(y) = 0$). In particular, (8) implies that for any $p > 0$ we can find $a > 0$ such that $y h(y) > p$ as
long as $y > a$. Hence,

$$\bar{F}(y) = e^{-\int_0^y h(u)\,du} \leq c_1 e^{-\int_a^y \frac{p}{u}\,du} = \frac{c_2}{y^p} \tag{9}$$

for some $c_1, c_2 > 0$. In other words, $\bar{F}(\cdot)$ decays faster than any power law. It is worth pointing out
that assumption (8) covers Weibull and log-normal service times, which have been observed to be
important models in call center analysis (see e.g. Brown et al (2005)).
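
Condition (8) can be inspected numerically. The sketch below evaluates $y h(y)$ for a Weibull and a lognormal law, where it grows without bound, and for a Pareto law, where it stays constant so that (8) fails (consistent with (9), since a Pareto tail is itself a power law). The parameter values and function names are illustrative assumptions.

```python
import math

def weibull_yh(y, shape, scale):
    """y * h(y) for a Weibull law, where h(y) = (shape/scale)*(y/scale)**(shape-1)."""
    return shape * (y / scale) ** shape

def pareto_yh(y, alpha, y_min=1.0):
    """y * h(y) for a Pareto law with tail (y_min/y)**alpha on y >= y_min."""
    return alpha if y >= y_min else 0.0

def lognormal_yh(y, mu=0.0, sigma=1.0):
    """y * h(y) for a lognormal law, computed directly as y * f(y) / F_bar(y)."""
    z = (math.log(y) - mu) / sigma
    density = math.exp(-0.5 * z * z) / (y * sigma * math.sqrt(2.0 * math.pi))
    tail = 0.5 * math.erfc(z / math.sqrt(2.0))
    return y * density / tail

if __name__ == "__main__":
    for y in (10.0, 1e3, 1e6):
        print(y, weibull_yh(y, 0.5, 1.0), lognormal_yh(y), pareto_yh(y, 2.0))
```

Note that the lognormal case grows only logarithmically in $y$, so the condition is satisfied but slowly.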

Note that the service time distribution does not scale with $s$. Hence the traffic intensity, defined
by the ratio of arrival rate to service rate, is $\lambda EV$ (we sometimes drop the subscript $k$ of $V_k$ for
convenience). We assume that $\lambda EV < 1$. This corresponds to a quality-driven regime and implies
that loss is rare. We will see the importance of this assumption in our derivation of efficiency and
large deviations results in Section 3.

1.3 Representation of System Status 

Let $Q(t)$ be the number of customers in the GI/G/s system at time $t$. More generally, we let
$Q(t, y)$ be the number of customers at time $t$ who have residual service time larger than $y$,
where the residual service time at time $t$ for the $k$-th customer is given by $(V_k + A_k - t)^+$ (defined for
customers that are not lost). We also keep track of the age process $B(t) = \inf\{t - A_k : A_k \leq t\}$,
i.e. the time elapsed since the last arrival. We assume right-continuous sample paths, i.e. customers
who arrive at time $t$ and start service are considered to be in the system at time $t$, while those who
finish their service at time $t$ are outside the system at time $t$. We also make the assumption that
the service time is assigned and known upon arrival of each served customer. While not necessarily
true in practice, this assumption does not alter any output from a simulation point of view as far
as estimation of loss probabilities is concerned. To obtain a Markov description of the process,
we take $W_t = (Q(t, \cdot), B(t)) \in D[0, \infty) \times \mathbb{R}_+$ as the state of the process at time $t$. In the case of
bounded service time over $[0, M]$ the state space is further restricted to $D[0, M] \times \mathbb{R}_+$.

1.4 A Coupling GI/G/$\infty$ System

As indicated briefly before, at multiple points in this paper we shall use a GI/G/$\infty$ system that is
naturally coupled with the GI/G/s system under the above assumptions. This GI/G/$\infty$ system
has the same arrival process and service time distribution as the GI/G/s system but has an infinite
number of servers and thus no loss can occur. Furthermore, it labels $s$ of its servers from the
beginning. When a customer arrives, he chooses one of the idle labeled servers in preference
to the rest, and only chooses an unlabeled server if all the $s$ labeled servers are busy. It is then easy
to see that the evolution of the GI/G/$\infty$ system restricted to the $s$ labeled servers follows exactly
the same dynamics as the GI/G/s system that we are considering. The purpose of introducing this
system is to remove the nonlinear "boundary" condition on the queue, hence leading to tractable
analytical results that we can harness, while the coupling provides a link from this system back to
the original GI/G/s system. In this paper we shall use the superscript "$\infty$" to denote quantities
in the GI/G/$\infty$ system, so for example $Q^\infty(t)$ denotes the number of customers at time $t$ for the
GI/G/$\infty$ system, and so on.
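
One consequence of the coupling is that, on a common sample path, the first loss in the GI/G/s system occurs at exactly the first time the infinite-server count exceeds $s$. The sketch below checks this on simulated input; the Poisson arrivals and exponential service times in `sample_path` are hypothetical choices made only to generate data, and the function names are not from the paper.

```python
import heapq
import random

def first_overflow_times(arrival_times, service_times, s):
    """Return (first loss time in the loss system,
               first time the infinite-server count exceeds s).

    Either entry is None if the event never occurs on this sample path.
    """
    loss_busy, inf_busy = [], []       # min-heaps of departure times
    first_loss = first_exceed = None
    for t, v in zip(arrival_times, service_times):
        # Release servers whose service finished by time t (right-continuity).
        while loss_busy and loss_busy[0] <= t:
            heapq.heappop(loss_busy)
        while inf_busy and inf_busy[0] <= t:
            heapq.heappop(inf_busy)
        # Infinite-server system: every customer is served.
        heapq.heappush(inf_busy, t + v)
        if first_exceed is None and len(inf_busy) > s:
            first_exceed = t
        # Loss system: admit only if one of the s servers is idle.
        if len(loss_busy) < s:
            heapq.heappush(loss_busy, t + v)
        elif first_loss is None:
            first_loss = t
    return first_loss, first_exceed

def sample_path(n, lam, mean_service, seed=0):
    """Hypothetical inputs: Poisson arrivals and exponential service times."""
    rng = random.Random(seed)
    t, arrivals, services = 0.0, [], []
    for _ in range(n):
        t += rng.expovariate(lam)
        arrivals.append(t)
        services.append(rng.expovariate(1.0 / mean_service))
    return arrivals, services

if __name__ == "__main__":
    arrivals, services = sample_path(5000, 4.0, 1.0, seed=3)
    print(first_overflow_times(arrivals, services, 5))
```

After the first loss the two systems diverge, since the infinite-server system keeps the customer that the loss system drops.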

Throughout the paper we also use an overline to denote quantities that exclude the initial
customers. So for example $\bar{Q}^\infty(t, y)$ denotes the number of customers who arrive after time 0 in
the GI/G/$\infty$ system and are present at time $t$ having residual service time larger than $y$, i.e.
$\bar{Q}^\infty(t, y) = Q^\infty(t, y) - Q^\infty(0, t + y)$.






2 Simulation Methodology 

As we have discussed, two key issues in our algorithm are the choice of recurrent set and the 
importance sampling algorithm. We will present them in detail in Section 2.1 and Section 2.2 
respectively. 



2.1 Recurrent Set 

First of all, note that one can pick $T = n\Delta$ for some $\Delta > 0$ in the definition of loss probability
given by equation (1) and send $n \to \infty$. The introduction of the lattice of size $\Delta$ is useful to define
return times to the set $A$ only at lattice points. So, let us pick a fixed small time interval $\Delta$ (one
choice, for example, is 1/5 of the mean service time). We choose $A$ to be

$$A = \{Q(t, y) \in J(y) \text{ for all } y \in [0, \infty),\ t \in \{0, \Delta, 2\Delta, \ldots\}\}. \tag{10}$$

Here $J(y)$ is the interval

$$\left( \lambda s \int_y^\infty \bar{F}(u)\,du - \sqrt{s}\,C^* \xi(y),\ \lambda s \int_y^\infty \bar{F}(u)\,du + \sqrt{s}\,C^* \xi(y) \right) \tag{11}$$

for some well chosen constant $C^* > 0$ (discussed in Remark 1 below and in Section 4) and

$$\xi(y) = \nu(y) + \gamma \int_y^\infty \nu(u)\,du \tag{12}$$

where

$$\nu(y) = \left( \lambda \int_y^\infty \bar{F}(u)\,du \right)^{1/(2+\eta)} \tag{13}$$

with any constants $\eta, \gamma > 0$.

The form of $J(y)$ comes from the heavy traffic limit of the GI/G/$\infty$ queue. Pang and Whitt
(2009) proved the fluid limit $Q^\infty(t, y)/s \to \lambda \int_y^{t+y} \bar{F}(u)\,du$ a.s. and the diffusion limit
$(Q^\infty(t, y) - \lambda s \int_y^{t+y} \bar{F}(u)\,du)/\sqrt{s} \Rightarrow R(t, y)$ for some Gaussian process $R(t, y)$ on the state space
$D[0, \infty)$ with $\text{var}(R(t, y)) \to \lambda c_a^2 \int_y^\infty \bar{F}(u)^2\,du + \lambda \int_y^\infty F(u)\bar{F}(u)\,du$ as $t \to \infty$, where $c_a$ is the
coefficient of variation of the interarrival times. Our recurrent set $A$ is thus a "confidence band" of
the steady state of $Q^\infty(t, y)$, with the width of the confidence band decaying slower than the standard
deviation of $Q^\infty(\infty, \cdot)$. It can be proved (see Proposition 1) that this choice of $A$ indeed leads to a
return time that is subexponential in $s$. The slower decay rate of the confidence band width is a technical
adjustment to enlarge $A$ so that a subexponential (in $s$) return time for the GI/G/$\infty$ system is
guaranteed. In fact, for the case of bounded service time, it suffices to set $\eta = 0$.
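
For intuition about the shape of the band, the sketch below evaluates $J(y)$ from (11)-(13) in the special case of exponential service times with rate $\mu$, where both integrals have closed forms. The exponential law, the parameter values and the finite grid used in `in_recurrent_set` are illustrative assumptions; the set in (10) requires membership for all $y$, which a grid check only approximates.

```python
import math

def band(y, s, lam, mu, c_star, eta, gamma):
    """Interval J(y) of (11) for Exp(mu) service times (an illustrative choice).

    With F_bar(u) = exp(-mu*u), the integral of F_bar over [y, inf) equals
    exp(-mu*y)/mu, and the integral of nu over [y, inf) equals nu(y)*(2+eta)/mu.
    """
    tail_integral = math.exp(-mu * y) / mu
    nu = (lam * tail_integral) ** (1.0 / (2.0 + eta))     # equation (13)
    xi = nu + gamma * nu * (2.0 + eta) / mu               # equation (12)
    center = lam * s * tail_integral
    half_width = math.sqrt(s) * c_star * xi
    return center - half_width, center + half_width

def in_recurrent_set(q_profile, ys, **params):
    """Check Q(t, y) in J(y) on a finite grid ys (a discretisation of (10))."""
    bands = (band(y, **params) for y in ys)
    return all(lo < q < hi for q, (lo, hi) in zip(q_profile, bands))

if __name__ == "__main__":
    params = dict(s=100, lam=0.5, mu=1.0, c_star=1.0, eta=0.1, gamma=0.1)
    ys = [0.0, 0.5, 1.0, 2.0, 5.0]
    fluid = [params["lam"] * params["s"] * math.exp(-y) for y in ys]
    print([band(y, **params) for y in ys])
    print(in_recurrent_set(fluid, ys, **params))
```

The band is centered at the fluid profile, so the fluid profile itself always lies in the set, while a profile displaced by an amount of order $s$ does not.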

Remark 1. The interval $J(y)$ contains a non-negative integer for any value of $y$ if $C^*$ is chosen
large enough. In fact, observe that the length of $J(y)$ is continuous and decreasing in $y$, and let

$$l(s) = \sup\left\{ y > 0 : \sqrt{s}\,C^* \xi(y) \geq \tfrac{1}{2} \right\}. \tag{14}$$

If $y$ is such that the width of $J(y)$ is equal to 1 (equivalently $y = l(s)$) we have that the center of
$J(y)$, namely $\lambda s \int_y^\infty \bar{F}(u)\,du$, satisfies

$$0 < \lambda s \int_y^\infty \bar{F}(u)\,du \leq \frac{\lambda}{(C^*)^{2+\eta}} \left( \sqrt{s}\,C^* \xi(y) \right)^{2+\eta} s^{-\eta/2} = \frac{\lambda}{(C^*)^{2+\eta}} \left( \frac{1}{2} \right)^{2+\eta} s^{-\eta/2}.$$

The right hand side is less than 1/2 for $(C^*)^{2+\eta} > \lambda$ and this implies that $\{0\} \subset J(y)$ for $y = l(s)$.
Now, if $y > l(s)$, we can ensure that the half-width of $J(y)$, namely $\sqrt{s}\,C^* \xi(y)$, is larger than the
center, if $C^*$ is chosen sufficiently large. To see this, note that a sufficient condition is that

$$\lambda s \int_y^\infty \bar{F}(u)\,du \leq \sqrt{s}\,C^* \left( \lambda \int_y^\infty \bar{F}(u)\,du \right)^{1/(2+\eta)}$$

which is equivalent to

$$s^{1/2} \left( \int_y^\infty \bar{F}(u)\,du \right)^{(1+\eta)/(2+\eta)} \leq C^* \lambda^{-(1+\eta)/(2+\eta)}$$

or

$$s^{(1+\eta/2)/(1+\eta)} \int_y^\infty \bar{F}(u)\,du \leq (C^*)^{(2+\eta)/(1+\eta)} \lambda^{-1}.$$

Now, choosing $C^* > \max(\lambda, 1)$, we have, for $y > l(s)$,

$$s^{(1+\eta/2)/(1+\eta)} \int_y^\infty \bar{F}(u)\,du \leq s^{1+\eta/2} \int_y^\infty \bar{F}(u)\,du \leq \frac{1}{(C^*)^{2+\eta}} \left( \frac{1}{2} \right)^{2+\eta} \leq (C^*)^{(2+\eta)/(1+\eta)} \lambda^{-1}$$

which gives the required implication. So $\{0\} \subset J(y)$ for $y > l(s)$. Obviously $J(y)$ includes at least one
integer when $y < l(s)$ (because the width of $J(y)$ is larger than 1). Therefore $J(y)$ always contains a
non-negative integer for any $y \geq 0$, and the recurrent set $A$ is hence well-defined.

Remark 2. One may ask whether it is possible to define $A$ in a finite-dimensional fashion, instead
of introducing the functional "confidence band" in (10). For example, one may divide the
domain of $y$ into segments $[y_i, y_{i+1}), i = 0, 1, 2, \ldots, r(s) - 1$ for some integer $r(s)$ with $y_0 = 0$ and
$y_{r(s)} = \infty$, where the length of each segment can be dependent on $s$ and non-identical. One then
defines the recurrent set as $\{Q(t, \cdot) : Q(t, y_i) - Q(t, y_{i+1}) \in A_i \text{ for } i = 0, \ldots, r(s) - 1\}$ for some
well-defined sets $A_i$. As we will see in the arguments in the subsequent sections, the important criteria
of a good recurrent set are: 1) it consists of a significantly large region in the central limit theorem,
so that it is visited often enough, 2) its deviation from the mean of $Q(t, y)$ is small, in the sense that
the distance between any element in this recurrent set and the mean of the steady state of $Q(t, y)$,
at every $y \in [0, \infty)$, has order $o(s)$. Criterion 2) is important, otherwise the large deviations of loss
starting from two different elements in the recurrent set can be substantially different. We want to
avoid having to consider several substantially different paths that can contribute to the loss event in
a significant way, as having such variability would complicate the design of the importance sampling
estimator.

Keeping criterion 2) in mind, we conclude that it is important to fine-tune the scale of the
segments $[y_i, y_{i+1})$ to preserve the efficiency of the algorithm. This suggests that a reasonable
description of the recurrent set would involve a dimension that grows at a suitable rate as $s \to \infty$,
thereby effectively obtaining a set of the form that we propose. The functional definition of $A$ in
(10) happens to balance both criteria 1) and 2).



2.2 Simulation Algorithm 

First we shall explain some heuristics in constructing the algorithm. As we discussed earlier, the
choice of $A$ isolates the rarity of steady-state loss probability to $E_A N_A$, which in turn is small because
of the difficulty in approaching overflow from $A$. So on an exponential scale, $E_A N_A \approx P_A(\tau_s < \tau_A)$,
where $P_A(\cdot)$ is the probability measure with initial state distributed as the steady-state distribution
conditional on $A$, and $\tau_s = \inf\{t > 0 : Q(t) > s\}$ is the first passage time to overflow. Observe
that the probability $P_A(\tau_s < \tau_A)$ is identical for the GI/G/s and the coupled GI/G/$\infty$ system since
the systems are identical before $\tau_s$. The key idea is to leverage our knowledge of the structurally
simpler GI/G/$\infty$ system. In fact, one can show that the greatest contribution to $P_A(\tau_s < \tau_A)$ is
the probability $P_A(Q^\infty(t^*) > s)$ for some optimal time $t^*$, whereas the contribution by other times
is exponentially smaller.

In view of this heuristic, one may think that the most efficient importance sampling scheme is to
exponentially tilt the process as if we are interested in estimating the probability $P_A(Q^\infty(t^*) > s)$.
However, doing so does not guarantee a small "overshoot" of the process at $\tau_s$. Instead, we introduce
a randomized time horizon following the idea of Blanchet, Glynn and Lam (2009). The likelihood
ratio will then consist of a mixture of individual likelihood ratios under different time horizons,
and a bound on the overshoot is attained by looking at the right horizon (namely $\lceil \tau_s \rceil$ as explained
in Section 3).

Hence our algorithm will take the following steps. Suppose we start from some position in
$A$. First we sample a randomized time horizon with some well-chosen distribution. Then we tilt
the coupled GI/G/$\infty$ process to target overflow over this realized time horizon, i.e. as if we are
estimating $P_A(Q^\infty(t) > s)$ for the realized time horizon $t$. This involves sequential tilting of both
the arrivals and service times. Once overflow is hit, we switch back to the GI/G/s system, drop the
lost customers, and change back to the arrival rate and service times under the original measure
to run the GI/G/s system until $A$ is reached. At this time one sample of $N_A$ is recorded together
with the likelihood ratio.

The key questions now are: 1) the sequential tilting scheme of arrivals and service times given
a realized time horizon, 2) the distribution of the random time, and 3) the likelihood ratio of this mixture
scheme. In the following we will explain these ingredients in detail and then lay out our algorithm.
The proof of efficiency will be deferred to Section 3.



2.2.1 Sequential Tilting Scheme 

Denote by $P_r(\cdot)$ and $E_r[\cdot]$ the probability measure and expectation with initial system status $r$.
Suppose we want to estimate $P_r(Q^\infty(t) > s)$ efficiently for a GI/G/$\infty$ system as $s \nearrow \infty$, where
$r(\cdot) \in J(\cdot) \subset D[0,\infty)$ (so that $r(y)$ is the number of initial customers still in the system at time
$y$). An important clue is an invocation of the Gartner-Ellis Theorem (see Dembo and Zeitouni (1998))
to obtain a large deviations result. Although this may not give an immediate importance sampling
scheme, it can suggest the type of exponential tilting needed, which can then be verified to be efficient.
This approach was proposed by Glynn (1995) and Szechtman and Glynn (2002), which we briefly recall here.
To be more specific, let us introduce more notation. Let, for any $t > 0$,

$$\psi_t(\theta) := \int_0^t \psi_N(\log(e^{\theta}\bar F(t-u) + F(t-u)))\,du \tag{15}$$

This is the Gartner-Ellis limit (see for example Dembo and Zeitouni (1998)) of $Q^\infty(t)$ since

$$\frac{1}{s}\log E e^{\theta Q^\infty(t)} = \frac{1}{s}\log E\exp\Big\{\theta\sum_{i=1}^{N_s(t)} I(V_i > t - A_i)\Big\} \to \int_0^t \psi_N(\log(e^{\theta}\bar F(t-u) + F(t-u)))\,du$$

where $I(\cdot)$ is the indicator function (see Glynn (1995) for a proof; it uses (7) and the definition of a
Riemann sum. Alternatively, see Lemma 6 in Section 3 as a generalization of this result). Let us
state the following properties of $\psi_t(\cdot)$ for later convenience:

Lemma 1. $\psi_t(\cdot)$ is defined on $\mathbb{R}$, twice continuously differentiable, strictly convex and steep.

Next let $a_t = 1 - \lambda\int_t^\infty \bar F(u)\,du$. Note that $a_t s + o(s)$ is the number of customers needed, excluding
the initial ones, to reach overflow at time $t$. In other words,

$$P_r(Q^\infty(t) > s) = P(Q^\infty(t) > a_t s + o(s)) \tag{16}$$

Now denote by $\theta_t$ the unique positive solution of the equation $\psi_t'(\theta) = a_t$. Such a solution exists
because $\psi_t(\cdot)$ is steep and $a_t = 1 - \lambda\int_t^\infty \bar F(u)\,du > \lambda\int_0^t \bar F(u)\,du = \psi_t'(0)$. Then under our
current assumptions the Gartner-Ellis Theorem concludes that $(1/s)\log P_r(Q^\infty(t) > s) \to -I_t$ where

$$I_t = \sup_{\theta\in\mathbb{R}}\{\theta a_t - \psi_t(\theta)\} = \theta_t a_t - \psi_t(\theta_t) \tag{17}$$

$I_t$ is the so-called rate function of $Q^\infty(t)$ evaluated at $a_t$.

At this point let us note the following properties of $\theta_t$ and $I_t$ when regarded as functions of $t$:

Lemma 2. $\theta_t$ satisfies the following:

1. $\theta_t > 0$ is non-increasing in $t$ for all $t > 0$.

2. $\lim_{t\to 0}\theta_t = \infty$.

3. $\lim_{t\to\infty}\theta_t = \theta_\infty$ where $\theta_\infty$ is the unique positive root of the equation $\psi_\infty'(\theta) = 1$, and

$$\psi_\infty(\theta) = \int_0^\infty \psi_N(\log(e^{\theta}\bar F(u) + F(u)))\,du \tag{18}$$

Lemma 3. $I_t$ satisfies the following:

1. $I_t$ is non-increasing in $t$ for $t > 0$.

2. $\lim_{t\to\infty} I_t = \inf_{t>0} I_t = I^*$ where

$$I^* = \theta_\infty - \psi_\infty(\theta_\infty) \tag{19}$$

3. If $V$ has bounded support over $[0, M]$, then $I^* = I_t$ for any $t \ge M$.
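As a numerical illustration of (17) and Lemmas 2 and 3, the following minimal sketch evaluates $a_t$, $\theta_t$ and $I_t$ in closed form for one special case. It assumes Poisson arrivals, so that $\psi_N(\theta) = \lambda(e^\theta - 1)$ and hence $\psi_t(\theta) = \lambda(e^\theta - 1)\int_0^t \bar F(u)\,du$, and exponential service times with rate `mu`. These modelling choices, the function names and the default parameter values are illustrative assumptions and are not taken from the paper.

```python
import math

def rate_function(t, lam=0.5, mu=1.0):
    """Return (a_t, theta_t, I_t) for an assumed M/M/infinity special case.

    Assumptions (not from the paper): psi_N(theta) = lam*(exp(theta) - 1)
    and Fbar(u) = exp(-mu*u), with lam/mu < 1.
    """
    g = (1.0 - math.exp(-mu * t)) / mu      # int_0^t Fbar(u) du
    tail = math.exp(-mu * t) / mu           # int_t^inf Fbar(u) du
    a_t = 1.0 - lam * tail                  # customers needed, per unit of s
    theta_t = math.log(a_t / (lam * g))     # root of psi_t'(theta) = a_t
    psi = lam * (math.exp(theta_t) - 1.0) * g
    return a_t, theta_t, theta_t * a_t - psi

def limiting_rate(lam=0.5, mu=1.0):
    """I* = theta_inf - psi_inf(theta_inf) in the same special case."""
    rho = lam / mu
    return -math.log(rho) - 1.0 + rho

if __name__ == "__main__":
    for t in (0.1, 0.5, 1.0, 5.0, 20.0):
        print(t, rate_function(t))
    print("I* =", limiting_rate())
```

In this special case both $\theta_t$ and $I_t$ decrease in $t$, and $I_t$ approaches $I^*$ for large $t$, in line with the two lemmas.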

To construct an implementable efficient importance sampling scheme, one can look at the derivative
of $\psi_t(\theta)$:

$$\psi_t'(\theta) = \int_0^t \psi_N'(\log(e^{\theta}\bar F(t-u) + F(t-u)))\,\frac{e^{\theta}\bar F(t-u)}{e^{\theta}\bar F(t-u) + F(t-u)}\,du$$

which is the mean of $Q^\infty(t)$ under the exponential change of measure with parameter $\theta$. When
$\theta = 0$, $\psi_t'(0) = \int_0^t \psi_N'(0)\bar F(t-u)\,du = \lambda\int_0^t \bar F(t-u)\,du$. Comparing with $\psi_t'(\theta_t)$ suggests a build-up
of the system by accelerating the arrival rate from $\lambda$ to $\psi_N'(\log(e^{\theta_t}\bar F(t-u) + F(t-u)))$ at time
$u$ and changing the service time distributions such that the probability for an arrival at time $u$ to
stay in the system at time $t$ is given by $e^{\theta_t}\bar F(t-u)/(e^{\theta_t}\bar F(t-u) + F(t-u))$. Denote by $P^t(\cdot)$ and
$E^t[\cdot]$ the probability measure and expectation under importance sampling. The above changes
can be achieved by setting an exponential tilting of the $i$-th interarrival time $U_i$ by

$$P^t(U_i \in dy) = \exp\{-s\psi_N(\log(e^{\theta_t}\bar F(t-A_i) + F(t-A_i)))\,y - \kappa_s(-s\psi_N(\log(e^{\theta_t}\bar F(t-A_i) + F(t-A_i))))\}\,P(U_i \in dy)$$

$$= e^{-s\psi_N(\log(e^{\theta_t}\bar F(t-A_i) + F(t-A_i)))\,y}\,(e^{\theta_t}\bar F(t-A_i) + F(t-A_i))\,P(U_i \in dy)$$

given the $i$-th arrival time $A_i$ (recall the convention $U_i = A_{i+1} - A_i$), and for an arrival at $A_i$ its
tilted service time distribution follows

$$P^t(V_i \in dy) = \begin{cases} \dfrac{f(y)\,dy}{e^{\theta_t}\bar F(t-A_i) + F(t-A_i)} & \text{for } 0 \le y \le t - A_i \\[2ex] \dfrac{e^{\theta_t} f(y)\,dy}{e^{\theta_t}\bar F(t-A_i) + F(t-A_i)} & \text{for } y > t - A_i \end{cases}$$
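The tilted service-time law above is a two-component mixture: with probability $e^{\theta_t}\bar F(t-A_i)/(e^{\theta_t}\bar F(t-A_i)+F(t-A_i))$ the service time is drawn conditional on $V_i > t - A_i$, and otherwise conditional on $V_i \le t - A_i$. The sketch below samples one tilted pair $(U_i, V_i)$ under illustrative assumptions that are not part of the paper: exponential service times with rate `mu`, and Poisson arrivals with rate `lam*s`, in which case the tilted interarrival time is again exponential, with rate `lam*s` multiplied by $e^{\theta_t}\bar F(t-A_i)+F(t-A_i)$. The function name `tilted_step` and all parameter defaults are hypothetical.

```python
import math
import random

def tilted_step(theta, t, a_i, s, lam=0.5, mu=1.0, rng=random):
    """Sample (U_i, V_i) under the tilted measure for an arrival at a_i < t.

    Assumptions (not from the paper): Exp(mu) service times and Poisson
    arrivals with rate lam*s. Returns (u_i, v_i, p_stay).
    """
    x = t - a_i                          # remaining time until the horizon
    fbar = math.exp(-mu * x)             # P(V > t - A_i) under the original law
    mix = math.exp(theta) * fbar + (1.0 - fbar)
    p_stay = math.exp(theta) * fbar / mix
    if rng.random() < p_stay:
        # V conditional on V > x: memoryless property of the exponential
        v = x + rng.expovariate(mu)
    else:
        # V conditional on V <= x: inverse CDF of the truncated exponential
        w = rng.random()
        v = -math.log(1.0 - w * (1.0 - fbar)) / mu
    # For Poisson arrivals the tilted interarrival time is Exp(lam*s*mix)
    u = rng.expovariate(lam * s * mix)
    return u, v, p_stay

if __name__ == "__main__":
    print(tilted_step(theta=1.0, t=2.0, a_i=1.0, s=100))
```

For general (non-exponential) interarrival or service distributions the two conditional draws and the tilted interarrival law would need distribution-specific samplers; the mixture structure itself is unchanged.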



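The contribution to the likelihood ratio $P(\cdot)/P^t(\cdot)$ by each arrival and service time assignment is
accordingly (using a slight abuse of notation)

$$\frac{P(U_i)}{P^t(U_i)} = \frac{e^{s\psi_N(\log(e^{\theta_t}\bar F(t-A_i) + F(t-A_i)))\,U_i}}{e^{\theta_t}\bar F(t-A_i) + F(t-A_i)} \tag{20}$$

and

$$\frac{P(V_i)}{P^t(V_i)} = (e^{\theta_t}\bar F(t-A_i) + F(t-A_i))\,e^{-\theta_t I(V_i > t - A_i)} \tag{21}$$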



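We tilt the process using (20) and (21) until the time that we know overflow will happen at time
$t$, i.e. $t \wedge \tau_s[t]$ where $\tau_s[t] = \inf\{u > 0 : r(t) + \sum_{i=1}^{N_s(u)} I(V_i > t - A_i) > s\}$. The overall likelihood
ratio on the set $Q^\infty(t) > s$ will be

$$L = \prod_{i=1}^{N_s(\tau_s[t])-1}\frac{e^{s\psi_N(\log(e^{\theta_t}\bar F(t-A_i) + F(t-A_i)))\,U_i}}{e^{\theta_t}\bar F(t-A_i) + F(t-A_i)}\ \prod_{i=1}^{N_s(\tau_s[t])}(e^{\theta_t}\bar F(t-A_i) + F(t-A_i))\,e^{-\theta_t I(V_i > t - A_i)}$$

$$= \exp\Big\{s\sum_{i=1}^{N_s(\tau_s[t])-1}\psi_N(\log(e^{\theta_t}\bar F(t-A_i) + F(t-A_i)))\,U_i - \theta_t\sum_{i=1}^{N_s(\tau_s[t])} I(V_i > t - A_i)\Big\}\,\big(e^{\theta_t}\bar F(t-A_{N_s(\tau_s[t])}) + F(t-A_{N_s(\tau_s[t])})\big) \tag{22}$$

This estimator $L\,I(Q^\infty(t) > s)$ can be shown to be asymptotically optimal in estimating $P_r(Q^\infty(t) > s)$:

Proposition 2.

$$\limsup_{s\to\infty}\frac{1}{s}\log E^t[L^2; Q^\infty(t) > s] \le -2I_t$$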

Proof. The proof follows from Szechtman and Glynn (2002), but for completeness (and also because
our introduction of $\tau_s[t]$ simplifies the argument in their paper slightly) we shall present it
here.

Note that $\sum_{i=1}^{N_s(\tau_s[t])} I(V_i > t - A_i) = s + 1 - r(t) = a_t s + o(s)$ by the definition of $\tau_s[t]$ and $r(t)$.
Also, $e^{\theta_t}\bar F(t - A_{N_s(\tau_s[t])}) + F(t - A_{N_s(\tau_s[t])}) \le e^{\theta_t}$ since $\theta_t > 0$.
Since $\psi_N$ is continuous, $\sum_{i=1}^{N_s(\tau_s[t])-1}\psi_N(\log(e^{\theta_t}\bar F(t-A_i) + F(t-A_i)))\,U_i$ is an approximation
to the Riemann integral $\int_0^{\tau_s[t]}\psi_N(\log(e^{\theta_t}\bar F(t-u) + F(t-u)))\,du$, with intervals defined by
$0 = A_0 < A_1 < A_2 < \cdots < A_{N_s(\tau_s[t])}$, and within each interval the leftmost function value is used
as approximation (with the last interval truncated). Since $\psi_N(\log(e^{\theta_t}\bar F(t-u) + F(t-u)))$ is
non-decreasing in $u$ when $\theta_t > 0$, and $\tau_s[t] \le t$ on $Q^\infty(t) > s$, we have

$$\sum_{i=1}^{N_s(\tau_s[t])-1}\psi_N(\log(e^{\theta_t}\bar F(t-A_i) + F(t-A_i)))\,U_i \le \int_0^{\tau_s[t]}\psi_N(\log(e^{\theta_t}\bar F(t-u) + F(t-u)))\,du \le \int_0^{t}\psi_N(\log(e^{\theta_t}\bar F(t-u) + F(t-u)))\,du$$

on $Q^\infty(t) > s$. Hence (22) gives

$$L^2 \le e^{2s\psi_t(\theta_t) - 2\theta_t(a_t s + o(s))}$$

which yields the proposition. □
2.2.2 Distribution of Random Horizon 

Denote by $\tau$ our randomized time horizon. We propose a discrete power-law distribution for $\tau$,
independent of the process:

P(r = T + kS) = v ^- v ^ w for * = 0,X,2... P3) 

where $\delta = \delta(s) = c/s$ for some constant $c > 0$. The power-law distribution of $\tau$ is to avoid an
exponential contribution from the mixture probability to the likelihood ratio that may disturb
algorithmic efficiency. Notice that we use a power law of order 2, and in fact we can choose any
power law distribution (with finite mean so that it does not take a long time to generate the process
up to $\tau$).

$T$ is a constant to avoid tilting the process on a time horizon too close to 0; otherwise the likelihood
ratio would blow up for paths that hit overflow very early (because of the fact that $\lim_{t\to 0}\theta_t = \infty$
in Lemma 2 Part 2; see also Section 3). A good choice of $T$ is the following. Let $\tilde I_t = \sup_{\theta\in\mathbb{R}}\{\theta(1 - \lambda EV) - \psi_N(\theta)t\} = \tilde\theta_t(1 - \lambda EV) - \psi_N(\tilde\theta_t)t$, where $\tilde\theta_t$ is the solution to the equation $\psi_N'(\theta)t = 1 - \lambda EV$
(which exists by the steepness assumption for small enough $t$). This is the rate function of $N_s(t)$
evaluated at $1 - \lambda EV$.

We choose $0 < T < \infty$ that satisfies

$$\tilde I_T > 2I^* \tag{24}$$

which always exists by the following lemma:

Lemma 4. $\tilde I_t$ satisfies the following:

1. $\tilde I_t$ is non-increasing in $t$ for $t < \eta$ for some small $\eta > 0$.

2. $\tilde I_t \to \infty$ as $t \searrow 0$.

Remark 3. In fact by looking at the arguments in the next section, one can see that $\delta$ being merely
$o(1)$ leads to asymptotic optimality. However, the coarser the $\delta$, the larger is the subexponential
factor beside the exponential decay component in the variance, with the extreme that when $\delta$ is
of order 1, asymptotic optimality no longer holds. The choice of $\delta = c/s$ is found to perform well
empirically, as illustrated in Section 6.
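To make the choice of $T$ in (24) concrete, here is a small sketch that finds a valid $T$ by bisection. It assumes Poisson arrivals, so that $\psi_N(\theta) = \lambda(e^\theta - 1)$ and the rate function of $N_s(t)$ at level $b = 1 - \lambda EV$ has the closed form $b\log(b/(\lambda t)) - b + \lambda t$, which decreases from $+\infty$ to 0 on $(0, b/\lambda)$. It also uses the M/G/s value $I^* = -\log(\lambda EV) - 1 + \lambda EV$. The function names and defaults are hypothetical and the Poisson assumption is not part of the general GI/G/s setting.

```python
import math

def arrival_rate_fn(t, lam, ev):
    """Rate function of N_s(t) at level 1 - lam*ev, assuming Poisson arrivals."""
    b = 1.0 - lam * ev
    return b * math.log(b / (lam * t)) - b + lam * t

def choose_T(lam=0.5, ev=1.0, iters=200):
    """Return (T, I_star) where T is close to the largest value satisfying (24)."""
    rho = lam * ev                       # requires rho < 1
    i_star = -math.log(rho) - 1.0 + rho  # I* in the Poisson-arrival case
    target = 2.0 * i_star
    lo, hi = 0.0, (1.0 - rho) / lam      # rate function is decreasing on (0, hi)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if arrival_rate_fn(mid, lam, ev) > target:
            lo = mid                     # mid still satisfies (24): move right
        else:
            hi = mid
    return lo, i_star

if __name__ == "__main__":
    print(choose_T())
```

Any smaller positive value of $T$ also satisfies (24) in this setting, so in practice one might shrink the returned value slightly to leave a margin.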



2.2.3 Likelihood Ratio 

After sampling the randomized time horizon, we accelerate the process using the sequential tilting
scheme (20) and (21) with a realized $\tau = t$. But since we are now interested in the first passage
probability, we tilt the process until $t \wedge \tau_s \wedge \tau_A$ (rather than $\tau_s[t]$ defined above). If $t \wedge \tau_s < \tau_A$, we
continue the GI/G/s system under the original measure. Also, to prevent a blow-up of the likelihood
ratio close to $t = 0$, we use the original measure throughout the whole process whenever $\tau = T$
(see the proof of efficiency in the next section). Now denote by $\tilde E[\cdot]$ and $\tilde P(\cdot)$ the importance sampling
measure. We have

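$$\tilde P(W_u,\ 0 \le u \le \tau_s \wedge \tau_A) = \sum_{k=0}^{\infty} P(\tau = T + k\delta)\,P^{T+k\delta}(W_u,\ 0 \le u \le \tau_s \wedge \tau_A)$$

(with $P^T(\cdot) = P(\cdot)$). So the overall likelihood ratio $L = L(W_\cdot)$ on the set $\tau_s < \tau_A$ is given by

$$L = \frac{dP}{d\tilde P} = \frac{P(W_u,\ 0 \le u \le \tau_s)}{\sum_{k=0}^{\infty} P(\tau = T + k\delta)\,P^{T+k\delta}(W_u,\ 0 \le u \le \tau_s)} = \frac{1}{\sum_{k=0}^{\infty} P(\tau = T + k\delta)\,L_{T+k\delta}^{-1}} \tag{25}$$

where $L_t = L_t(W_\cdot)$ is the individual likelihood ratio as a sequential product of (20) and (21) up to
$\tau_s \wedge t$, i.e.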



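$$L_t = \begin{cases} \exp\Big\{s\sum_{i=1}^{N_s(\tau_s)-1}\psi_N(\log(e^{\theta_t}\bar F(t-A_i) + F(t-A_i)))\,U_i - \theta_t\sum_{i=1}^{N_s(\tau_s)} I(V_i > t - A_i)\Big\} & \text{for } t \ge \tau_s \\[2ex] \exp\Big\{s\sum_{i=1}^{N_s(t)}\psi_N(\log(e^{\theta_t}\bar F(t-A_i) + F(t-A_i)))\,U_i - \theta_t\sum_{i=1}^{N_s(t)} I(V_i > t - A_i)\Big\} & \text{for } t < \tau_s \end{cases} \tag{26}$$

for $t > T$, and $L_t = 1$ for $t = T$.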



2.2.4 The Algorithm 

We now state our algorithm. Assuming we start from $r(\cdot) \in J(\cdot)$ with a given initial age $B(0)$, do
the following:

Algorithm 2

1. Set $A_0 = 0$. Also initialize $N_A \leftarrow 0$, $L \leftarrow 0$ and $\tau_s \leftarrow \infty$.

2. Sample $\tau$ according to (23). Say we get a realization $\tau = t$.

3. Simulate $U_0$ according to the initial age $B(0)$. Set $A_1 = U_0$. Check if $\tau_A$ is reached, in which
case go to Step 7.

4. Starting from $i = 1$, repeat the following (setting $\theta_t$ as the one in (17) for $t > T$ and 0 for
$t = T$):

(a) Generate $V_i$ according to

$$P^t(V_i \in dy) := \begin{cases} \dfrac{f(y)\,dy}{e^{\theta_t}\bar F(t-A_i) + F(t-A_i)} & \text{for } 0 \le y \le t - A_i \\[2ex] \dfrac{e^{\theta_t} f(y)\,dy}{e^{\theta_t}\bar F(t-A_i) + F(t-A_i)} & \text{for } y > t - A_i \end{cases}$$

(b) Generate $U_i$ according to

$$P^t(U_i \in dy) := e^{-s\psi_N(\log(e^{\theta_t}\bar F(t-A_i) + F(t-A_i)))\,y}\,(e^{\theta_t}\bar F(t-A_i) + F(t-A_i))\,P(U_i \in dy)$$

(c) Set $A_{i+1} = U_i + A_i$.

(d) If $\tau_A$ is reached in $[A_i, A_{i+1})$, go to Step 7.

(e) Compute $Q^\infty(A_{i+1})$. If $Q^\infty(A_{i+1}) > s$ then set $\tau_s \leftarrow A_{i+1}$, remove the new arrival at
$A_{i+1}$, update $N_A \leftarrow N_A + 1$, and go to Step 5.

(f) If $A_{i+1} > t$, go to Step 5.

(g) Update $i \leftarrow i + 1$.

5. Repeat the following:

(a) Generate $V_i$ and $U_i$ under the original measure. Set $A_{i+1} = U_i + A_i$.

(b) If $\tau_A$ is reached in $[A_i, A_{i+1})$, go to Step 6.

(c) Compute $Q(A_{i+1})$. This includes the removal of the new arrival $A_{i+1}$ from the system in
case it is a loss; in such a case update $N_A \leftarrow N_A + 1$, and set $\tau_s \leftarrow A_{i+1}$ if in addition
$\tau_s = \infty$.

(d) Update $i \leftarrow i + 1$.

6. Compute $L\,I(\tau_s < \tau_A)$ using (25) and (26).

7. Output $N_A L\,I(\tau_s < \tau_A)$.



3 Algorithmic Efficiency 

In this section we will prove asymptotic optimality of the estimator output by Algorithm 2. To
be more precise, we will identify $I^*$ defined in (19) as the exponential decay rate of $E_A N_A$. The
key result is the following:

Theorem 2. The second moment of the estimator in Algorithm 2 satisfies

$$\limsup_{s\to\infty}\frac{1}{s}\log\tilde E_r[N_A^2 L^2; \tau_s < \tau_A] \le -2I^*$$

for any $r(\cdot) \in J(\cdot)$.

This result, together with Theorem 3 in the sequel, will expose a loop of inequalities that leads
to asymptotic optimality and the large deviations asymptotics simultaneously. The main technicality of
this result is an estimate of the continuity of the likelihood ratio, or intuitively the "overshoot" at
the time of loss. It draws upon a two-dimensional point process description of the system, in which
the geometry of the process plays an important role in estimating this "overshoot".

Proof. Denote $\lceil x \rceil = \min\{T + k\delta,\ k = 0, 1, \ldots : x \le T + k\delta\}$. Also recall the definition $a_t = 1 - \lambda\int_t^\infty \bar F(u)\,du$.



Consider the likelihood ratio in (|25|) 
LI(t s < r A 



1 



ET= P(r = T + k5)L 



— I(t s < T A ) < 



L 



T+k8 



P(r=\r s ]) 

N s (r s )-1 



I(t s < T A ) 



P(T = T)- 1 I(T s <T;r s <T A ) + P(T=\T s ])- 1 expL ^N^og(e 8 ^F(\r s ]-A 

I i=i 



N s {t s )-1 



+ F{\T s \-A i )))U i -9 lTa] Y, I(Vi> \r s ]-A i )\l(T s >T;r s <T A ) 



8=1 



C 2 T 



< CxI(t s <T;t s < t a ) + — exp 



I(t s >T;t s < t a ) 



< C 1 I(t s <T;t s <t a ) + 



I(t s >T;t s < ta) 



{ S ^ rTsl (^ Tsl ) - fl W (Q°°(r Sl \r a ] - t s ) - 1)} 
exp J - si* + 9 lTs] I sa lTs] + 1 - Q°°(r s , \t s ] - t s ) 



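where $C_1$ and $C_2$ are positive constants. Note that the second inequality comes from the fact
that $\sum_{i=1}^{N_s(\tau_s)-1}\psi_N(\log(e^{\theta_{\lceil\tau_s\rceil}}\bar F(\lceil\tau_s\rceil - A_i) + F(\lceil\tau_s\rceil - A_i)))\,U_i$ is a Riemann sum of the integral
$\psi_{\lceil\tau_s\rceil}(\theta_{\lceil\tau_s\rceil}) = \int_0^{\lceil\tau_s\rceil}\psi_N(\log(e^{\theta_{\lceil\tau_s\rceil}}\bar F(\lceil\tau_s\rceil - u) + F(\lceil\tau_s\rceil - u)))\,du$ (excluding the intervals at the two
ends) and that $\psi_N(\log(e^{\theta_{\lceil\tau_s\rceil}}\bar F(\lceil\tau_s\rceil - u) + F(\lceil\tau_s\rceil - u)))$ is a non-decreasing function in $u$. Also note
that $\sum_{i=1}^{N_s(\tau_s)} I(V_i > \lceil\tau_s\rceil - A_i) = Q^\infty(\tau_s, \lceil\tau_s\rceil - \tau_s)$ is the number of customers who arrive before
$\tau_s$ and leave after $\lceil\tau_s\rceil$. The last inequality follows from the definition of $I_{\lceil\tau_s\rceil}$ and Lemma 3 Part
2. Now we have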



E r [N A L ; t s < t a ] = E r [N A L; t s < t a ] 



< dErlNl; t s <T;t s < t a ] + ^e~ sr ' E r 



Co 



si* 



t s >T;t s < t a 



N 2 A T 3 s exp{e lTs] ( S a lTs] +l-Q°°(T s ,\ Ts ]-T s ))} 



(27) 



Consider the first summand. By Holder's inequality E r [N\; t s <T;t s < t a ] < (E r [N 2 A p ]) 1 l' p (P r {T s < 
T)) 1 /" for l/p+l/q = 1. Also, P t {t s <T)< P(N s (T) > s-r(T)) < P{N S (T) > s(l - XEV) + o{s)) 
and a straightforward invocation of Gartner-Ellis Theorem yields limg^oo -logP(A^ s (T) > s(l — 



XEV) + o(s)) = —It < —21* by our choice of T in (24). Combining these observations, and using 
Lemma [TJ we get 

limsup - log E r [N\\ t s <T;t s < t a ] < limsup — log£; r [^ p ] + limsup — log P r {r s < T) < -21* 

s— >oo S s—>oo Sp s— >oo SQ 
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for q close enough to 1. 



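In view of (27) and Dembo and Zeitouni (1998) Lemma 1.2.15, the proof will be complete once
we can prove that

$$\limsup_{s\to\infty}\frac{1}{s}\log E_r\big[N_A^2\,\tau_s^3\exp\{\theta_{\lceil\tau_s\rceil}(s a_{\lceil\tau_s\rceil} + 1 - Q^\infty(\tau_s, \lceil\tau_s\rceil - \tau_s))\};\ \tau_s > T;\ \tau_s < \tau_A\big] \le -I^* \tag{28}$$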



To this end, we write 



Er 



iV|T s 3 exp{e rTsl (sa [Tsl +1-Q°°(r s , \t s ] -t s ))};t s > T;t s < r A 
iY^r/expi^ 



s + l-Xs I F(u)du - Q°°(t s , \t s ] - T S 



;t s >T;t s < t a 



< e CdT ^ s E r 



Nlr 3 s exp{e lTs] (s + l-rCKD-Q 00 ^,^] - r s ))} ;t s > T;r s < r A 



5> 



k=l 



N 2 A T 3 s exp{6 lTs] ( s + l-r(\T s -})-Q°°(T s ,\T s ] - t s ))} ;\t s ] =T + k5; 



t a >T + (k-l)S 

oo 

< e ce T VS ^(£ r Ar2p)Vp(£ rT ^)i/9(p r ( rA > r + (A; — 1)5))^ 

k=l 

(E r [exp {W T+kS (s + 1 - r(T + kS) - Q°°(t s , T + kS - r s )) } ; T + (k - 1)5 < r s < T + kS] ) 1/1 

oo 

= Y,( E r N A) 1/P ( E rTA 3<1 ) 1/q {Pr(TA > T + (k - l)6))^ h 

k=l 

{E r [exp {W T +kS {s + l- r{r s ) - Q°°(r s , T + k5 - t s )) } ; T + {k - 1)5 < t s < T + k5]) 1/l (29) 

where $C$ is a positive constant and $1/p + 1/q + 1/h + 1/l = 1$. The first inequality follows from
the fact that $r(\cdot) \in J(\cdot)$ and Lemma 3 Part 1, while the second inequality follows from the generalized
Holder's inequality. The last equality holds because $r(\tau_s) - r(T + k\delta) = o(s)$, again since $r(\cdot) \in J(\cdot)$,
for $T + (k-1)\delta < \tau_s \le T + k\delta$.
We now analyze

$$E_r\big[\exp\{l\theta_{T+k\delta}(s + 1 - r(\tau_s) - Q^\infty(\tau_s, T + k\delta - \tau_s))\};\ T + (k-1)\delta < \tau_s \le T + k\delta\big] \tag{30}$$

We plot the arrivals on a two-dimensional plane, with the x-axis indicating the time of arrival and the
y-axis indicating the assigned service time at the time of arrival. Such a plot has been used in the
study of the M/G/$\infty$ system (see for example Foley (1982)). In this representation it is easy to see
that the departure time of an arriving customer is the 45° projection of the point onto the x-axis.
As a result, $Q^\infty(t)$ for example, will be the number of all the points inside the triangular simplex
created by a vertical line and a downward 45° line joining at the point $(t, 0)$. See Figure 1.
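The geometric description above translates directly into a counting rule on the points (arrival time, assigned service time). The following minimal sketch counts the points in the triangular region for $Q^\infty(t)$ and in the shifted region for $Q^\infty(t, y)$, the number of customers present at time $t$ with residual service time exceeding $y$. The function names are hypothetical, and the sketch ignores initial customers (those summarized by $r(\cdot)$), which is a simplifying assumption.

```python
def q_infinity(points, t):
    """Customers present at time t: arrived by t and departing after t."""
    return sum(1 for a, v in points if a <= t < a + v)

def q_residual(points, t, y):
    """Customers present at time t whose residual service time exceeds y."""
    return sum(1 for a, v in points if a <= t and a + v > t + y)

if __name__ == "__main__":
    # (arrival time, assigned service time) pairs
    pts = [(0.5, 2.0), (1.0, 0.3), (1.5, 1.0), (2.5, 0.2)]
    print(q_infinity(pts, 2.0))
    print(q_residual(pts, 2.0, 0.4))
```

A point $(A_i, V_i)$ lies in the triangle exactly when its 45° projection $A_i + V_i$ onto the time axis falls to the right of $t$ while $A_i$ itself is to the left of $t$.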

For notational convenience we denote $Q^\infty_{t_1,t_2}[t_3,t_4] := \sum_{i=N_s(t_1)+1}^{N_s(t_2)} I(t_3 - A_i < V_i \le t_4 - A_i)$ as the
number of customers in the GI/G/$\infty$ system who arrive sometime in $(t_1, t_2]$ and leave the system
sometime in $(t_3, t_4]$. It is easy to see, for example, that $Q^\infty(\tau_s, T + k\delta - \tau_s) = Q^\infty_{0,\tau_s}[T + k\delta, \infty]$ for
$T + k\delta \ge \tau_s$.

[Figure 1: Arrivals plotted with arrival time on the horizontal axis and assigned service time at arrival on the vertical axis; $A_i + V_i$ is the departure time of the customer.]

[Figure 2: Same axes; region of points with $V_i > T + k\delta - A_i$.]

[Figure 3: Same axes; $G_k := Q^\infty(T + (k-1)\delta) + N_s(T + k\delta) - N_s(T + (k-1)\delta)$ = number of points in the inverted trapezoid; $H_k := Q^\infty_{0,T+k\delta}[T + (k-1)\delta, T + k\delta]$ = number of points in the strip.]

[Figure 4: Same axes; division of the regions for $G_k$ and $H_k$ at the time point $z = z(k, s)$.]

Figure 2 shows the region filled in by $Q^\infty(\tau_s, T + k\delta - \tau_s) = Q^\infty_{0,\tau_s}[T + k\delta, \infty]$ as a shifted simplex
starting from the point $(\tau_s, T + k\delta - \tau_s)$. Note that by definition $Q^\infty(\tau_s) = s + 1 - r(\tau_s)$, and so
$s + 1 - r(\tau_s) - Q^\infty_{0,\tau_s}[T + k\delta, \infty]$ corresponds to the downward strip ending at $(\tau_s, 0)$ and $(\tau_s, T + k\delta - \tau_s)$,
which is obviously smaller than the region represented by $H_k := Q^\infty_{0,T+k\delta}[T + (k-1)\delta, T + k\delta]$
in Figure 3.

Define $G_k = Q^\infty(T + (k-1)\delta) + N_s(T + k\delta) - N_s(T + (k-1)\delta)$, which is represented by the
trapezoidal area depicted in Figure 3. Observe that $T + (k-1)\delta < \tau_s \le T + k\delta$ implies that one
of the triangular simplices corresponding to $Q^\infty(t)$, for $T + (k-1)\delta < t \le T + k\delta$, has a number of
points larger than $s - r(T + (k-1)\delta)$. This in turn implies that the region represented by $G_k$ has
more than $s - r(T + (k-1)\delta)$ points.

The above observations lead to

$$E_r\big[\exp\{l\theta_{T+k\delta}(s + 1 - r(\tau_s) - Q^\infty_{0,\tau_s}[T + k\delta, \infty])\};\ T + (k-1)\delta < \tau_s \le T + k\delta\big] \le E_r\big[e^{l\theta_{T+k\delta}H_k};\ G_k > s - r(T + (k-1)\delta)\big] \tag{31}$$

From now on we focus on the case when the service time has unbounded support (the bounded
support case is simpler and will be presented later in the proof). We introduce a time point
$z = z(k, s)$ and consider the divisions of areas represented by $H_k$ and $G_k$ in Figure 4:

$$H_k^1(z) := Q^\infty_{0,z}[T + (k-1)\delta, T + k\delta] \subset G_k^1(z) := Q^\infty_{0,z}[T + (k-1)\delta, \infty]$$

$$H_k^2(z) := Q^\infty_{z,T+k\delta}[T + (k-1)\delta, T + k\delta] \subset G_k^2(z) := Q^\infty_{z,T+k\delta}[T + (k-1)\delta, \infty]$$

Note that $H_k = H_k^1(z) + H_k^2(z)$ and $G_k = G_k^1(z) + G_k^2(z)$.

Moreover, define $A_i^k$, $i = 1, \ldots, G_k$ to be the arrival times of all the customers that $G_k$ is counting.
Note that given the arrival times $A_i^k$, $i = 1, \ldots, G_k$, the events whether each of these customers falls
into $H_k$ are independent Bernoulli random variables with probability

$$\frac{\bar F(T + (k-1)\delta - A_i^k) - \bar F(T + k\delta - A_i^k)}{\bar F(T + (k-1)\delta - A_i^k)} \tag{32}$$
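As a quick sanity check of why the probability in (32) is of order $\delta$, consider the illustrative special case of exponential service times with rate $\mu$, i.e. $\bar F(x) = e^{-\mu x}$ for $x \ge 0$ and $\bar F(x) = 1$ for $x < 0$. This distributional choice is an assumption made only for this example. Writing $x = T + (k-1)\delta - A_i^k$:

```latex
% Arrival before T+(k-1)\delta, i.e. x \ge 0:
\frac{\bar F(x) - \bar F(x+\delta)}{\bar F(x)}
  = \frac{e^{-\mu x} - e^{-\mu (x+\delta)}}{e^{-\mu x}}
  = 1 - e^{-\mu\delta} \le \mu\delta .
% Arrival in (T+(k-1)\delta, T+k\delta], i.e. -\delta \le x < 0 and \bar F(x) = 1:
\frac{\bar F(x) - \bar F(x+\delta)}{\bar F(x)}
  = 1 - e^{-\mu (x+\delta)} \le \mu (x+\delta) \le \mu\delta .
```

In both cases each customer counted by $G_k$ falls into $H_k$ with probability at most $\mu\delta$, which is small when $\delta = c/s$. For a general service distribution the denominator $\bar F(\cdot)$ can be small for old arrivals, which is why the unbounded-support argument introduces the cut-off point $z$.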



Hence we can write (31) as 



< R 



E r [ e w T+ks( H U*)+ H2 kW);G k >s-r(T+(k- 1)5)} 
E r [E r [e WT + k ^ H k^+ H k( z ^\A^i = 1, . . . , G k ];G k > s — r(T + (k — 1)5)} 
E r [E r [e w ^ H l^\Ati= 1, . . . , G 1 k (z)]E r [e ie ™ H ZM\A*,i = G\{z) + 1, . . . , G\(z) + G 2 (z)}; 
G 1 k(z) + G 2 k (z)> S -r(T+(k-l)5)} 

Gl(z)+Gl(z) 

e w T+kS Gl(z) Yl (1 + (e ldT + kS - l)p*);Gl(z) + G 2 k (z) > s - r{T + (k - 1)5) 

i=G\(z)+l 



(33) 



Let 



/ \ k C5 

Pk(z) := sup < 

1 ' Ai> z ~ F(T + k5-z) 



(34) 



for some constant C > 0, where the inequality follows from (32). Also let 
^ k (9):= log Ee 0G lM 

JO 
r T+kS 

i> N {\og{e°F(T + {k- 1)5 - u) + F{T + (k - 1)5 - u)))du + o(s) 



^ k {9) := log Ee 6G l^ = s J 



i; N (log(e e F(T + (k - 1)5 - u) + F(T + (k - 1)5 - u)))du + o(s) 

T+kS 



where o(s) is uniform in 9, k and z. This is due to the following lemma, whose proof will be deferred 
to the appendix: 



Lemma 5. We have 
1 



logEe 9 ^^ 



4> N (log(e e F(t -u) + F(t - u)))du 



uniformly over 9 € [6oo, 9t}, t >T and 0<w<z<t + n for any r) > 0. 
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When pk(z) is small enough, (33) is less than or equal to 

= ^[^[eWT+M^W+iogd+^T+w-Dp^,))^^). G 2 {z) >s _ r{T+{k _ l)6) _ G i (z) | G i (2)j B{z)]] 
< E r [e W {W T+kS Gl(z) - e T+[k _ 1)s (s - r(T + (k - 1)5) - G\(z)) 
+ Vi, jfc (log(l + (e^ - l)p fc (z)) + T+ ( fc -i)*)}] 

= exp |^, z ,fc(^T+fc5 + 0T+{k-i)s) - &T+(k-i)s( s ~ r ( T +{k- 1)5)) 
+ < z , fc (log(l + (e w ^ - l)p k (z)) + 6 T+{k _ 1)s )\ 

= exp js j ip N (log(e WT+kS+dT +( k ~v s F(T + (k - 1)5 - u) + F(T + (k — 1)5 - u)))du 

- s ^(^(e^^+^+'^^^^'+^+f^ 1 )^^ + (k - 1)5 -u) + F(T + {k- 1)5 - u)))du 

Jo 

- e T+{k _ 1)S (s - r(T + (k — 1)5)) + s^ r +(fc-i)*(log(l + (e Wr+kS - l)Pk(z)) + T +(k-i)s) 

+ o( S )| (35) 

where the inequality follows by Chernoff's inequality, and the last equality follows from 
i>lzA 9 ) = s ^T+(k-l)6(0) ~ « f V^(log(e e F(T + (k — 1)5 - u) + F(T + (k - 1)5 - u)))du + o(s) 



o 



uniformly, by Lemma [5} 

Now let p s /* oo be a sequence satisfying sF(p s ) /*■ oo, whose existence is guaranteed by the 
unbounded support assumption. We divide into two cases: For T + (k — 1)5 < p s , we put z = 



and so by (34) and we have p k (0) \ as s /* oo (recall 5 = 0(l/s)). Consequently (35) becomes 



ex V {-e T+{k _ 1)s ( S -r(T+(k-l)5))+s^ T+(k _ 1)S (^ 

For T + (k — 1)5 > p s , we put z = T + (k — 1)5 — p s so that T + (k — 1)5 — z = p s . Hence again 
Pk(z) \ 0. Also, 

I ^ N (log{e WT + kS+9T +( k -^ s F(T +{k- 1)5 - u) + F(T + (k - 1)5 - u)))du 
Jo 

rT+(k-l)S 

= / ■tp N (log(e ieT + k5+eT +( k -^F{u) + F(u)))du 

POO 

< / Ci\(e WT + kS+0T +^ s - l)F(u)du 

JT+(k-l)S-z 



OO 



C 2 \ / F(u)du = o(l) 



Ps 



for large enough T+(k — 1)5— z = p s and some constants C\, Ci > 0, due to the fact that log(l+x) < 
x for x > and that V'jv(O) = ^- It is now obvious that (35) also becomes e~ sIr+(k ~ 1 '> s+0 ^ in this 
case. 
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Hence ( 29 ) is less than or equal to 



-sI*/l+o(s) 



Y,(E r N 2 V) l lv(E r T A ^)V\P r (T A >T + (k- 1)5)) 



l/h 



k=l 



< e-^/^^iErN^Y/PiErTA^) 1 ^ ((P r (T A > T)) l l h + - j°°(P r (T A > u))V h du 
From this, and using Lemma [TJ we get 



lim sup - log E r 

s— >oo S 



NlT s 2 exp{6 lTs] (sa lTs] + 1 - 0°°(r s , [r,] - r s ))} ;t s > T;t s < t a 



< 



I* 



Since $l$ can be chosen arbitrarily close to 1, we have proved (28).



Finally, we consider the case when $V$ has bounded support over $[0, M]$. Pick a small constant
$a > 0$, and consider the set of customers $\tilde G_k = Q^\infty_{(T+(k-1)\delta-M)\vee 0,\,T+k\delta}[T + (k-1)\delta - a, \infty]$ that consists
of $G_k$ and a trapezoidal strip of width $a$ running through $(T + (k-1)\delta - a, 0)$, $(T + (k-1)\delta, 0)$,
$((T + (k-1)\delta - M) \vee 0,\ M \wedge (T + (k-1)\delta))$ and $((T + (k-1)\delta - M) \vee 0,\ M \wedge (T + (k-1)\delta) - a)$.
See Figure 5.

[Figure 5: Assigned service time at arrival (vertical axis, with level $M$ marked) against arrival time (horizontal axis, with $T + (k-1)\delta$ and $T + k\delta$ marked).]



Denote A\ , i = 1 , . . . , as the arrival times of customers falling in G k . Then we have 

E r [e WT + ksHk ;G k > s - r(T + (k - 1)5)} 
< E r [e WT + ksHk ;G k > s - r(T + (k - 1)5)} 
= E r [E r [e w ^ Hk \At i = 1, . . . ,G k ];G k > s — r(T + (k — 1)5)} 

~G k 

+ (e w ^)pf); G k >s-r(T + (k- 1)5) 



where 



Pi 



E r 



8=1 



F(T + (k- 1)5 - if) - F(T + k5- if) ^ _ , 

— < Pk ■■= sup pf < 



C5 



F(T + (k-l)5-a- if) 



i=l,...,G fc 



F(M — a) 



(36) 






Hence ( 36 ) is less than or equal to 

Er [ e log(i+(e l9T + kS )Pk)G k . Q k > s _ r ( T + ( k _ ^ 

< e -e T+(fc _ 1)5 (s-r(T+(fc-l)5))+^ fe (log(l+( e i9 T+ fe ^i)p) +9T+(fc _ 1)5 ) 

where 4> k (6) '■= \ogEe eGk , by Chernoff's inequality. Now note that by Lemma^we have 



(37) 



MO) = s 



T+kS 

ip N (log(e e F(T + (k — 1)5 — a — u) + F(T + (jfe - 1)5 - a - u)))du + o(s) 

(T+(fc-l)<5-M)V0 
(M-a)A(T+(fc-l)<S-a) 

ip N (log(e e F(u) + F(u)))du + s^ N {0){a + 5) + o(s) 



< sip T+ik _ 1)s (9) + saC + o(s) 



for some constant C > 0, uniformly in 6 and /c. Hence (37) is less than or equal to 

e - e T+(k-l)s( s - 1 i T +{ k - 1 ) S ))+ s ^T+(k-l)s( 9 T+(k~l)s)+ sa C+°( s ) 
- p -sI T+{k _ 1)s +saC+o(s) 



Thus ( 29 ) is less than or equal to 

oo 

e -si*/i+saC/i+o(s) Y j (E r N 2 ]') 1 l p (E T T A Zq ) 1 l q {P r {T A >T + (k- l)5)) l/h 



k=l 



This gives 



lim sup - log E r 

s— >oo S 



iY|r s 3 exp{e M (sa lTs] + 1 - Q°°(r s , \r s ] -t s ))};t s >T;t s <t a 



I* aC 



Since $l$ and $a$ can be chosen arbitrarily close to 1 and 0 respectively, (28) holds and the conclusion
follows.

□ 

Remark 4. The proof can be simplified in the case of the M/G/s system. In particular, there is no
need to condition on $A_i^k$ nor to introduce the constant $a$ in the case of bounded support $V$. Since
arrivals are Poisson, the two-dimensional description of arrivals via the arrival time and the required
service time at the time of arrival leads to a Poisson random measure. Hence all the points in $G_k$
are independently sampled, each with probability of falling into $H_k$ being



j^ +kS (F(T + (k— 1)5 - u) - F(T + k5- u))d 



Pk 



U 



< 



C5{M + 5) 



f 

Jo 



T+(k-l)5 



F(u)du + N s ((k-l)5,k5) 



0(5) 



for some constant C > 0. Then (30) immediately becomes 



E r [(p k e WT + kS + 1 - p k ) Gk ;G k >s-r(T+(k- 1)5)] 
E r [e°^ Gk ;G k > s - r(T + (k - 1)5)] 



The rest follows similarly as in the proof.

Remark 5. Note that the result coincides with Erlang's loss formula in the case of M/G/s (see
for example Asmussen (2003)), which states that the loss probability is exactly given by

$$P_\pi(\text{loss}) = \frac{(\lambda s EV)^s / s!}{1 + \lambda s EV + \cdots + (\lambda s EV)^s / s!}$$

Simple calculation reveals that $(1/s)\log P_\pi(\text{loss}) \to \log(\lambda EV) + 1 - \lambda EV = -I^*$.
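The limit in Remark 5 can be checked numerically. The sketch below evaluates Erlang's loss formula with the standard recursion on the reciprocal of the loss probability and compares $(1/s)\log P_\pi(\text{loss})$ with $\log(\lambda EV) + 1 - \lambda EV$. The function names and the value $\lambda EV = 0.5$ are illustrative assumptions; the plain floating-point recursion used here is only suitable for moderate $s$ (a few thousand servers at this load).

```python
import math

def erlang_b(s, load):
    """Erlang loss probability for s servers and offered load `load`.

    Uses the recursion 1/B(n) = 1 + (n/load) * 1/B(n-1), with B(0) = 1.
    """
    inv_b = 1.0
    for n in range(1, s + 1):
        inv_b = 1.0 + inv_b * n / load
    return 1.0 / inv_b

def decay_rate(rho):
    """log(rho) + 1 - rho, i.e. -I* for the M/G/s system with rho = lambda*EV."""
    return math.log(rho) + 1.0 - rho

if __name__ == "__main__":
    rho = 0.5
    for s in (50, 200, 1000, 2000):
        print(s, math.log(erlang_b(s, rho * s)) / s, decay_rate(rho))
```

The gap between the two printed columns shrinks slowly with $s$, reflecting the subexponential prefactor that the logarithmic asymptotics ignore.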

The next result we will discuss is the lower bound:

Theorem 3. For any $r(\cdot) \in J(\cdot)$, we have

$$\liminf_{s\to\infty}\frac{1}{s}\log P_r(\tau_s < \tau_A) \ge -I^*$$

It suffices to prove that $\liminf_{s\to\infty}(1/s)\log P_r(\tau_s < \tau_A) \ge -I_{t_n}$ for a sequence $t_n \nearrow \infty$, thanks
to Lemma 3 Parts 1 and 2. In fact we will take $t_n = n\Delta$. In the case of bounded support $V$, it suffices
to only consider $n\Delta = \lceil M \rceil$ because of Lemma 3 Part 3. For each $n\Delta$, the idea then is to identify
a so-called optimal sample path (or more precisely a neighborhood of such a path) that possesses a
rate function $I_{n\Delta}$ and has the property $\tau_s < \tau_A$. Note that the probability in consideration is the
same for the GI/G/s and GI/G/$\infty$ systems. Henceforth we consider paths in GI/G/$\infty$.

The way we define $A$ in (10) implies that it suffices to focus on the process on the time-grid
$\{0, \Delta, 2\Delta, \ldots\}$ for checking the condition $\tau_s < \tau_A$. For a path to reach $s$ at time $n\Delta$,
the form of $\psi_{n\Delta}'(\theta_{n\Delta})$ hints that $E[Q^\infty_{(k-1)\Delta,k\Delta}[(j-1)\Delta, j\Delta]\,|\,Q^\infty(n\Delta) > s] = s\alpha_{kj} + o(s)$ and
$E[Q^\infty_{(k-1)\Delta,k\Delta}[n\Delta, \infty]\,|\,Q^\infty(n\Delta) > s] = s\beta_k + o(s)$ where

$$\alpha_{kj} := \int_{(k-1)\Delta}^{k\Delta}\psi_N'(\log(e^{\theta_{n\Delta}}\bar F(n\Delta - u) + F(n\Delta - u)))\,\frac{F(j\Delta - u) - F((j-1)\Delta - u)}{e^{\theta_{n\Delta}}\bar F(n\Delta - u) + F(n\Delta - u)}\,du$$

and

$$\beta_k := \int_{(k-1)\Delta}^{k\Delta}\psi_N'(\log(e^{\theta_{n\Delta}}\bar F(n\Delta - u) + F(n\Delta - u)))\,\frac{e^{\theta_{n\Delta}}\bar F(n\Delta - u)}{e^{\theta_{n\Delta}}\bar F(n\Delta - u) + F(n\Delta - u)}\,du$$

for $k = 1, \ldots, n$, $j = k, \ldots, n$. Our goal is to rigorously justify that such a path is the optimal
sample path discussed above.

We now state two useful lemmas. The first is a generalization of Glynn (1995), whose proof
resembles this earlier work and is deferred to the appendix. The second one argues that the path
we identified indeed satisfies $\tau_s < \tau_A$.

Lemma 6. Let $\theta = (\theta_{kj}, \theta_{k\cdot})_{k=1,\ldots,n,\ j=k,\ldots,n} \in \mathbb{R}^{n(n+1)/2+n}$, and define

$$\psi(\theta) = \sum_{k=1}^{n}\int_{(k-1)\Delta}^{k\Delta}\psi_N\Big(\log\Big(\sum_{j=k}^{n} e^{\theta_{kj}}P((j-1)\Delta - u < V \le j\Delta - u) + e^{\theta_{k\cdot}}\bar F(n\Delta - u)\Big)\Big)\,du$$



We have 
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Lemma 7. Starting with any r(-) G </(•), i/te sample path with Q^L^a fcAtC? ~~ 1) A., j^X] € ((afcj + 
7fcj) s > (afej +e)s) ; Q^_ 1)AifcA ["-A,oo] G + 7 fc )s, (/3 fc + e)s) for all k = 1, . . . ,n and j = k, . . . ,n 
satisfies t s < t a . Here J kj ,j k > 0, Efc=i,...,n 7fej + £fc=i,...,„7* = 7<ooon<ie> 7 fcj ,e > 7 fc . 

j=k,...,n 

Proof. For £ = 1, . . . , n, consider 
l 

Q°°(IA) = E^-DA, fe A[^,oo] 
k=\ 

l I n \ I I n 

> E E ak i s + bkS + E E 7kj s + 7fc« 
fc=i \i=/+i / fe=i \i=/+i 

= E T A ^(log(e e -F(nA - u) + F(nA - u))) ^ " " } " ^ ~ 1} f = " } du 

/" fcA fl e e ™ A F(nA - u) \ 

J(k-i)A ;; V-AF(nA-u) + F(nA- U ) J 

i / n 

+ S E E 7fci + 7fc 
fe=l \j=Z+l 



»/ o «>V(log( £ "-F(„A - „) + F(„A - „,,) U^(nA-u) + F (n A- „) 



I I n 
+ S E E 7fcj+7fc 

fe=i y=/+i 

WA 

> As/ F(/A-n)dM + CiVs 

JO 

for any given constant Ci, when s is large enough. The last inequality follows from the monotonicity 
of tpN- Note that we then have Q°°(IA) = Q°°(IA) +r(lA) > As + C^a/s for any given constant C*2 
and large enough s. Hence t a is not reached in time nA when s is large. 
On the other hand, 



Q°°(nA) = EQ£_ 1)A)fcA [nA,oo] 
fc=i 

n n 

> E^ s + E 7 fc s 
fc=i fe=i 

™ rkA e e " A F(nA-u) A 

= *E / Wlog(e e "*F(nA - M ) + F(nA - u))) g 7 p7 a \ T F / A + 

J{k-i)A e y «A j p(nA - it) + F(nA - u) 

= .jf ^ (1 o g (^F(„A - u) + F(„A - M ))) ^ f (bA _ L - _ + ,g 7t 

n 

= sV'nA^nA) +«E 7 fe 



fc=l 
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where the last equality follows from the definition of 9 n &. So Q°°(nA) = Q°°{nA) + r(nA) > s 
when s is large enough. This concludes our proof. 

□ 

We now prove Theorem [3| 
Proof of Theorem^ Note that by Lemma [7| for any r(-) E </(•) and s large enough, 
P r (r s < ta) 

> PAQ$-i)A,kAi(j ~ !)A,iA] G ((ajy + 7 fc >,(ofci + e)a), Q(£-i )A ,fcA[ nA ' °°] G ((0* + 7*)*> Wk + 

fe = l,...,n, j = k,...,n) (38) 

for large enough s given arbitrary 7^, 7^ and e satisfying conditions in Lemma [7j Denote T = 
(7/y,7fc)fe=i,...,n, j=k,...,n- Let 

n n n 

5 r = [J II( a ^ + 7fci» «*j + e) x J]^ + 7fc) /3 fe + e) C K n ( ri+1 )/ 2 + n 

k=lj=k k=l 



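Using the Gärtner-Ellis Theorem for (38) and Lemma 6, we have

$$\frac{1}{s}\log P_r\Big(Q^\infty_{(k-1)\Delta,k\Delta}[(j-1)\Delta, j\Delta] \in \big((\alpha_{kj}+\gamma_{kj})s,\ (\alpha_{kj}+\epsilon)s\big),\ Q^\infty_{(k-1)\Delta,k\Delta}[n\Delta,\infty] \in \big((\beta_k+\gamma_k)s,\ (\beta_k+\epsilon)s\big),\ k=1,\ldots,n,\ j=k,\ldots,n\Big) \to -I_\Gamma \tag{39}$$

where $I_\Gamma = \inf_{x\in S_\Gamma} I(x)$ and

$$I(x) = \sup_{\theta \in \mathbb{R}^{n(n+1)/2+n}} \{\langle\theta, x\rangle - \psi(\theta)\}$$

with $\psi$ defined in Lemma 6. But note that for $k = 1,\ldots,n$, $j = k,\ldots,n$,

$$\frac{\partial}{\partial\theta_{kj}}\big(\langle\theta,x\rangle - \psi(\theta)\big) = x_{kj} - \int_{(k-1)\Delta}^{k\Delta} \psi'_N\Big(\log\Big(\sum_{j=k}^n e^{\theta_{kj}}P((j-1)\Delta-u < V \le j\Delta-u) + e^{\theta_{k\cdot}}\bar F(n\Delta-u)\Big)\Big) \frac{e^{\theta_{kj}}P((j-1)\Delta-u < V \le j\Delta-u)}{\sum_{j=k}^n e^{\theta_{kj}}P((j-1)\Delta-u < V \le j\Delta-u) + e^{\theta_{k\cdot}}\bar F(n\Delta-u)}\,du \tag{40}$$

$$\frac{\partial}{\partial\theta_{k\cdot}}\big(\langle\theta,x\rangle - \psi(\theta)\big) = x_{k} - \int_{(k-1)\Delta}^{k\Delta} \psi'_N\Big(\log\Big(\sum_{j=k}^n e^{\theta_{kj}}P((j-1)\Delta-u < V \le j\Delta-u) + e^{\theta_{k\cdot}}\bar F(n\Delta-u)\Big)\Big) \frac{e^{\theta_{k\cdot}}\bar F(n\Delta-u)}{\sum_{j=k}^n e^{\theta_{kj}}P((j-1)\Delta-u < V \le j\Delta-u) + e^{\theta_{k\cdot}}\bar F(n\Delta-u)}\,du \tag{41}$$

Define $x^* = (\alpha_{kj}, \beta_k)_{k=1,\ldots,n,\ j=k,\ldots,n}$. For $x = x^*$, it is straightforward to verify that $\theta^* = (\theta^*_{kj}, \theta^*_{k\cdot})$, where $\theta^*_{kj} = 0$ and $\theta^*_{k\cdot} = \theta_{n\Delta}$ for $k = 1,\ldots,n$, $j = k,\ldots,n$, sets the derivatives in (40) and (41) to zero. Since $\langle\theta,x\rangle - \psi(\theta)$ is concave in $\theta$, we have

$$I(x^*) = \langle\theta^*,x^*\rangle - \psi(\theta^*)$$
$$= \theta_{n\Delta}\sum_{k=1}^n \beta_k - \sum_{k=1}^n \int_{(k-1)\Delta}^{k\Delta} \psi_N\big(\log(F(n\Delta-u) - F((k-1)\Delta-u) + e^{\theta_{n\Delta}}\bar F(n\Delta-u))\big)\,du$$
$$= \theta_{n\Delta}\psi'_{n\Delta}(\theta_{n\Delta}) - \psi_{n\Delta}(\theta_{n\Delta})$$
$$= I^*$$

Now since $\langle\theta,x\rangle - \psi(\theta)$ is continuously differentiable in $\theta$ and $x$, by the Implicit Function Theorem, $I(x)$ is continuous in $x$. This implies that

$$I_\Gamma \le I(x^* + \Gamma) \to I(x^*) = I^*$$

as $\Gamma \to 0$. Together with (38) and (39), this gives the conclusion.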

□ 

Theorems [2] and [3] together imply both the asymptotic optimality of Algorithm 2 and the large 
deviations of the loss probability: 

Proof of Theorem^ Note that by Jensen's inequality 

$$P_r(\tau_s < \tau_A)^2 \le (E_r N_A)^2 \le \tilde E_r[N_A^2 L^2]$$

Hence using Theorems 2 and 3 yields

$$-2I^* \le \lim_{s\to\infty} \frac{1}{s}\log P_r(\tau_s < \tau_A)^2 \le \lim_{s\to\infty} \frac{1}{s}\log (E_r N_A)^2 \le \lim_{s\to\infty} \frac{1}{s}\log \tilde E_r[N_A^2 L^2] \le -2I^*$$

Combining this with Proposition 1, we conclude that the steady-state loss probability decays exponentially with rate $I^*$ and that Algorithm 2 is asymptotically optimal. □



4 Logarithmic Estimate of Return Time 

In this section we lay out the argument for Proposition 1. The first step is to reduce the problem to a GI/G/∞ calculation. Define $x(t) := \sup\{y : Q^\infty(t,y) > 0\}$ as the maximum residual service time among all customers present at time $t$.

Lemma 8. We have $\tau_A \le \tau'_A$ where

$$\tau'_A = \inf\{t \in \{\Delta, 2\Delta, \ldots\} : x(t-u) \le l,\ Q^\infty(w) < s \text{ for } w \in [t-u, t] \text{ for some } u > l,\ Q^\infty(t,\cdot) \in J(\cdot)\}$$

for any $l > 0$.

Proof. The way we couple the GI/G/∞ system implies that at any point in time the number of customers in the GI/G/s system is at most that of the coupled GI/G/∞ system (in fact the customers served in the GI/G/s system are a subset of those in the GI/G/∞ system). Suppose that at time $t-u$ we have $Q^\infty(t-u) < s$ and $x(t-u) \le l$. Then $Q^\infty(w) < s$ for $w \in [t-u,t]$ means that none of the arrivals in this interval are lost, i.e. they all get served in both the GI/G/∞ and the GI/G/s systems. Since $x(t-u) \le l < u$, all the customers present at time $t$ come from arrivals after time $t-u$. This implies that $Q(t,\cdot) = Q^\infty(t,\cdot)$. Hence the result of the lemma. □
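The coupling used in the proof of Lemma 8 can be illustrated numerically. The following is a minimal sketch, not the authors' code: it assumes the interarrival and service distributions of the numerical example in Section 5 (Gamma with shape 1/2 and rate 1/2 in the base scale, Uniform(0, 1) service), and the function names `coupled_counts` and `coincide_after_quiet_window` are hypothetical. Both systems are driven by the same arrivals and service times. Since service times are bounded by 1 here, a window of length 1 plays the role of the interval $[t-u, t]$.

```python
import heapq
import random


def coupled_counts(s, n_arrivals, seed=0):
    """Simulate coupled GI/G/s loss and GI/G/infinity systems.

    Both systems see the same arrival epochs and service times.
    Returns (time, busy_in_loss_system, busy_in_infinite_system)
    recorded just after each arrival.
    """
    rng = random.Random(seed)
    loss_heap = []  # departure times in the loss system
    inf_heap = []   # departure times in the infinite-server system
    t = 0.0
    history = []
    for _ in range(n_arrivals):
        # Base interarrival has mean 1 (shape 1/2, scale 2); scale by 1/s.
        t += rng.gammavariate(0.5, 2.0) / s
        service = rng.random()  # Uniform(0, 1) service time
        # Drop customers that have departed by time t.
        while loss_heap and loss_heap[0] <= t:
            heapq.heappop(loss_heap)
        while inf_heap and inf_heap[0] <= t:
            heapq.heappop(inf_heap)
        heapq.heappush(inf_heap, t + service)
        if len(loss_heap) < s:  # otherwise the customer is lost
            heapq.heappush(loss_heap, t + service)
        history.append((t, len(loss_heap), len(inf_heap)))
    return history


def coincide_after_quiet_window(history, s, window=1.0):
    """Check Q = Q_inf at arrival epochs preceded by a window with Q_inf < s."""
    ok = True
    start = 0
    for k, (t, q, q_inf) in enumerate(history):
        while history[start][0] <= t - window:
            start += 1
        if all(h[2] < s for h in history[start:k + 1]):
            ok = ok and (q == q_inf)
    return ok


history = coupled_counts(s=10, n_arrivals=20000)
dominated = all(q <= q_inf for _, q, q_inf in history)
print(dominated, coincide_after_quiet_window(history, s=10))
```

The first check reflects the subset property stated in the proof; the second reflects the conclusion $Q(t,\cdot) = Q^\infty(t,\cdot)$ after a sufficiently long period below capacity.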
The next step is to find a mechanism to identify the instant $t-u$ and to set an appropriate value for $l$ so that $\tau'_A$ is small. We use a geometric trial argument. Divide the time frame into blocks separated at $T_0 = 0, T_1, T_2, \ldots$ in such a way that (1) a "success" in a block means that $\tau'_A$ is reached before the end of the block, and (2) the processes $\{W_u,\ T_i < u \le T_{i+1}\}$, $i = 0, 1, \ldots$, are roughly independent. We then estimate the probability of "success" in a block and also the length of a block to obtain a bound for $\tau'_A$.

At this point let us also introduce a fixed constant $t_0$ and state the following result:

Lemma 9. For any fixed $t_0 > 0$,

$$P\Big(Q^\infty(t,y) \in \Big(\lambda s\int_y^{t+y}\bar F(u)\,du \pm \sqrt{s}\,C_1\nu(y)\Big) \text{ for all } t \in [0,t_0],\ y \in [0,\infty)\ \Big|\ B(0)\Big) > C_2 > 0 \tag{42}$$

and

$$P\Big(Q^\infty(t,y) \notin \Big(\lambda s\int_y^{t+y}\bar F(u)\,du \pm \sqrt{s}\,C_1\nu(y)\Big) \text{ for some } t \in [0,t_0],\ y \in [0,\infty)\ \Big|\ B(0)\Big) > C_3 > 0 \tag{43}$$

for large enough $C_1 > 0$ and some constants $C_2$ and $C_3$, all independent of $s$, uniformly for all initial ages $B(0)$. Here $\nu(y)$ is defined in (13).

To prove this lemma, the main idea is to consider the diffusion limit of $Q^\infty(t,y)$ as a two-dimensional Gaussian field and then invoke the Borell-TIS inequality (Adler (1990)). By Pang and Whitt (2009) we know

$$\frac{Q^\infty(t,y) - \lambda s\int_y^{t+y}\bar F(u)\,du}{\sqrt{s}} \Rightarrow R(t,y)$$

in the space $D_{D[0,\infty)}[0,\infty)$, where

$$R(t,y) = R_1(t,y) + R_2(t,y) \tag{44}$$

is a two-dimensional Gaussian field given by

$$R_1(t,y) = \int_0^t\int_0^\infty I(u + x > t + y)\,dK(u,x) \tag{45}$$

and

$$R_2(t,y) = \sqrt{\lambda c_a^2}\int_0^t \bar F(t+y-u)\,dW(u) \tag{46}$$

where $W(\cdot)$ is a standard Brownian motion, and $K(u,x) = W(\lambda u, F(x)) - F(x)W(\lambda u, 1)$ in which $W(\cdot,\cdot)$ is a standard Brownian sheet on $[0,\infty) \times [0,1]$. $W(\cdot)$ and $K(\cdot,\cdot)$ are independent processes. $c_a$ is the coefficient of variation, i.e. the ratio of standard deviation to mean, of the interarrival times. The key step is then to show an estimate of this limiting Gaussian process:

Lemma 10. Fix $t_0 > 0$. For $i = 1, 2$, we have

$$P\big(|R_i(t,y)| \le C_*\nu(y) \text{ for all } t \in [0,t_0],\ y \in [0,\infty)\big) > 0$$

for a well-chosen constant $C_* > 0$, where $R_i(\cdot,\cdot)$ and $\nu(\cdot)$ are defined in (44), (45), (46) and (13).

This lemma relies on an invocation of the Borell-TIS inequality on the Gaussian process $\tilde R_i(t,y) = R_i(t,y)/\nu(y)$ for $i = 1, 2$. The verification of the conditions for such an invocation is tedious but routine, and hence is deferred to the appendix. Here we provide a brief outline of the arguments, writing $S = [0,t_0] \times [0,\infty)$. For $i = 1, 2$:

Step 1: Define a d-metric (in fact a pseudo-metric)

$$d_i((t,y),(t',y')) = E\big(\tilde R_i(t,y) - \tilde R_i(t',y')\big)^2$$

where $\tilde R_i(t,y) = R_i(t,y)/\nu(y)$. Show that the domain $[0,t_0] \times [0,\infty]$ can be compactified under this (pseudo) metric.

Step 2: Use an entropy argument (see for example Adler (1990)) to show that $E\sup_S \tilde R_i(t,y) < \infty$. In particular, $\tilde R_i(t,y)$ is a.s. bounded over $S$.

Step 3: Invoke the Borell-TIS inequality, i.e. for $x \ge E\sup_S \tilde R_i(t,y)$,

$$P\Big(\sup_S \tilde R_i(t,y) \ge x\Big) \le \exp\Big\{-\frac{1}{2\sigma_i^2}\Big(x - E\sup_S \tilde R_i(t,y)\Big)^2\Big\}$$

where

$$\sigma_i^2 = \sup_S E\tilde R_i(t,y)^2.$$

From these steps, it is straightforward to conclude Lemma 10. The rest of the proof of Lemma 9 is to show the uniformity over $U_0$ in the weak limit of $Q^\infty$ to $R$. This is done by restricting to the set $U_0 \le x$ for $x = O(1/s)$ and using the light-tail property of $U_0$. Again, the derivation is tedious but straightforward; the details are provided in the appendix.
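To make the shape of the Borell-TIS bound in Step 3 concrete, here is a small Monte Carlo sketch on a much simpler Gaussian process than $\tilde R_i$: a standard Brownian motion on $[0, 1]$, for which $\sup_t \mathrm{Var}\,W(t) = 1$. This choice of process, the grid discretization and the function names are assumptions made purely for illustration; the sketch says nothing about the constants in Lemma 10.

```python
import math
import random


def sup_brownian_paths(n_paths, n_steps, seed=0):
    """Simulate sup over a grid of [0, 1] for n_paths Brownian paths."""
    rng = random.Random(seed)
    step_sd = math.sqrt(1.0 / n_steps)
    sups = []
    for _ in range(n_paths):
        w = 0.0
        best = 0.0  # W(0) = 0 is part of the supremum
        for _ in range(n_steps):
            w += rng.gauss(0.0, step_sd)
            best = max(best, w)
        sups.append(best)
    return sups


def borell_tis_bound(x, mean_sup, sigma2):
    """exp(-(x - E sup)^2 / (2 sigma^2)), meaningful for x above E sup."""
    return math.exp(-((x - mean_sup) ** 2) / (2.0 * sigma2))


sups = sup_brownian_paths(n_paths=2000, n_steps=200)
mean_sup = sum(sups) / len(sups)  # Monte Carlo estimate of E sup W
sigma2 = 1.0                      # sup of Var W(t) over [0, 1]
for x in (1.5, 2.0, 2.5):
    empirical = sum(v > x for v in sups) / len(sups)
    print(x, empirical, borell_tis_bound(x, mean_sup, sigma2))
```

In this toy case the empirical tail sits well below the bound, which is typical: the inequality is used in the text only to obtain a positive probability of staying inside the band, not a sharp estimate.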

We need one more lemma:

Lemma 11. Let $V_k$ be r.v.'s with distribution function $F(\cdot)$ satisfying the light-tail assumption. For any $p > 0$, we have

$$E\Big(\max_{k=1,\ldots,n} V_k\Big)^p = O(l_p(n)^p) = o(n^\epsilon)$$

where

$$l_p(n) = \inf\Big\{y : np\int_y^\infty u^{p-1}\bar F(u)\,du < \eta\Big\} \tag{47}$$

for a constant $\eta > 0$, and $\epsilon$ is any positive number.

Proof. Let $\bar F_n(x) = P(\max_{k=1,\ldots,n} V_k > x)$. Since $\bar F_n(u) \le \min(1, n\bar F(u))$, note that

$$E\Big(\max_{k=1,\ldots,n} V_k\Big)^p = p\int_0^\infty u^{p-1}\bar F_n(u)\,du \le y^p + np\int_y^\infty u^{p-1}\bar F(u)\,du$$

for any $y \ge 0$. Pick $y = l_p(n)$. Then

$$E\Big(\max_{k=1,\ldots,n} V_k\Big)^p = O(l_p(n)^p).$$

Using (9) we have $l_p(n)^p = o(n^\epsilon)$ for any $\epsilon > 0$. □
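As a sanity check on the growth rate in Lemma 11, consider the special case of i.i.d. Exp(1) variables with $p = 1$. This case is an illustrative assumption, not one used in the paper. The expected maximum of $n$ such variables is the harmonic number $H_n \approx \log n$, which is $o(n^\epsilon)$ for every $\epsilon > 0$. The sketch below tabulates $H_n / n^\epsilon$ for a hypothetical $\epsilon = 0.25$.

```python
def expected_max_exponential(n):
    """E[max of n i.i.d. Exp(1)] equals the n-th harmonic number H_n."""
    return sum(1.0 / k for k in range(1, n + 1))


eps = 0.25  # any positive exponent works; 0.25 is an arbitrary choice
ratios = []
for n in (100, 1000, 10000, 100000):
    h = expected_max_exponential(n)
    ratios.append(h / n ** eps)
    print(n, round(h, 4), round(ratios[-1], 4))
```

The ratio decreases along this grid of $n$, in line with the $o(n^\epsilon)$ statement; heavier but still light tails would change the constant, not the conclusion.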
We are now ready to prove Proposition 1, for which we need the following construction. Pick $\gamma = 1/t_0$ where $\gamma$ is introduced in (13) and $\xi(\cdot)$ is defined in (12). Recall $C_1$ as in Lemma 9. Define $T_i$, $i = 0, 1, 2, \ldots$ as follows: Given $T_{i-1}$, define

$$v(s) = \inf\{y : \sqrt{s}\,C_1\xi(y) < 1\}$$
$$z = \inf\{kt_0,\ k = 1, 2, \ldots : kt_0 > v(s) + \Delta\}$$
$$x_i = x(T_{i-1})$$
$$w_i = \inf\{kt_0,\ k = 1, 2, \ldots : kt_0 > x_i\}$$
$$d_i = A_{N_s(T_{i-1}+w_i)+1} - (T_{i-1} + w_i), \text{ i.e. } d_i \text{ is the time until the first arrival after } T_{i-1} + w_i$$
$$T_i = T_{i-1} + w_i + d_i + z$$

Note that $w_i$ and $z$ are multiples of $t_0$. For convenience define, for $u \le t$, $Q^\infty_u(t,y) := Q^\infty(u+t, y) - Q^\infty(u, t+y)$ as the number of arrivals after time $u$ that have residual service time larger than $y$ at time $u+t$. We define a "success" in block $i$ to be the event $\zeta_i$ that all of the following occur:

1) $Q^\infty_{T_{i-1}+(k-1)t_0}(t,y) \in \big(\lambda s\int_y^{t+y}\bar F(u)\,du \pm \sqrt{s}\,C_1\nu(y)\big)$ for all $t \in [0,t_0]$, for every $k = 1, 2, \ldots, w_i/t_0$.

2) $d_i \le c/s$ for a small constant $c > 0$.

3) $Q^\infty_{T_{i-1}+w_i+d_i+(k-1)t_0}(t,y) \in \big(\lambda s\int_y^{t+y}\bar F(u)\,du \pm \sqrt{s}\,C_1\nu(y)\big)$ for all $t \in [0,t_0]$, for every $k = 1, 2, \ldots, z/t_0$.

Roughly speaking, $\zeta_i$ occurs when the GI/G/∞ system behaves "normally" for a long enough period so that $Q^\infty(t)$ stays within capacity for that period and the steady-state confidence band $J(\cdot)$ is reached at the end (see the discussion preceding Proposition 1). More precisely, starting from $T_{i-1}$ and given $x(T_{i-1})$, $T_{i-1} + w_i$ is the time when all customers in the previous block have left. Adjusting for the age at time $T_{i-1} + w_i$, and starting from $T_{i-1} + w_i + d_i$, $z$ is a long enough time so that the system would fall into $J(\cdot)$ if it behaves normally in each step of size $t_0$ throughout the period. It can be seen by summing up the interval boundaries that the occurrence of $\zeta_i$ ensures that $\tau'_A$ is reached during the last $\Delta$ units of time before $T_i$.

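Proof of Proposition 1. We first check that the occurrence of event $\zeta_i$ implies that $\tau'_A$ is reached during the last $\Delta$ units of time before $T_i$. As discussed above, since $w_i > x_i$, all the customers present at time $T_{i-1} + w_i$ are those that arrive after time $T_{i-1}$. Hence the occurrence of $\zeta_i$ implies that

$$Q^\infty(T_{i-1}+w_i, y) \in \sum_{k=1}^{w_i/t_0}\Big(\lambda s\int_{(k-1)t_0+y}^{kt_0+y}\bar F(u)\,du \pm \sqrt{s}\,C_1\nu((k-1)t_0+y)\Big)$$
$$\subset \Big(\lambda s\int_y^{w_i+y}\bar F(u)\,du \pm \sqrt{s}\,C_1\Big(\nu(y) + \frac{1}{t_0}\int_y^\infty \nu(u)\,du\Big)\Big)$$
$$\subset \Big(\lambda s\int_y^{w_i+y}\bar F(u)\,du \pm \sqrt{s}\,C_1\xi(y)\Big) \tag{48}$$

and

$$Q^\infty(T_{i-1}+w_i+d_i, y) \in \Big(\lambda s\int_{d_i+y}^{w_i+d_i+y}\bar F(u)\,du \pm \sqrt{s}\,C_1\xi(d_i+y)\Big).$$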
For each $t \in ((k-1)t_0, kt_0]$, denote $[t] = t - (k-1)t_0$, for $k = 1, \ldots, z/t_0$. Then
Q^iTi^ + Wi + di + t.y) 



G As 



tUi +(2; +'(/+< 



c As 



F(u)du + As / F{u)du 

J di+y+t 

Wi/to k 

HU - l)*o + (k + (fc - l)t + [*] + y) + + ^ K(j - 2 )*o + [t] + y)I(k > 1) 

3=1 3=2 

Wi/to+k—1 



F(u)du + As 



t+y 



C As / F(u)du + As 



C As / F(u)du + As 



di+y+t 
Wi+di+y+t 

di+y+t 
Wi+di+y+t 

di+y+t 



F(u)du ± VsCi 



i/((7 - l)t + [t] + y) + v{y) 

3=1 

1 



Mv) + 



t 



v(u)du 



F(u)du ± y/sC'£(y) 



(4 



where $C' = 2C_1$ (which depends on $\gamma$).

It is now obvious that $\zeta_i$ implies $Q^\infty(t) < s$ for $t \in [T_{i-1}+w_i, T_i]$. By the definition of $v(s)$, (48) and the fact that the integral term $\lambda s\int \bar F(u)\,du$ in (48) is smaller and decays faster than $\sqrt{s}\,C_1\xi(y)$ for $y > v(s)$ when $s$ is large, we get $x(T_{i-1}+w_i) \le v(s) < z$. Let $\tilde T_i = \sup\{k\Delta : k\Delta \le T_i\}$ be the largest time before $T_i$ at which $A$ can possibly be hit, i.e. the largest such time in the $\Delta$-skeleton. It remains to show that $Q^\infty(T_i, y) \in J(y)$ in order to conclude that $\zeta_i$ implies a hit on $\tau'_A$.



From (J49J), for t G [T f _i + «;< + dj, Tj], 



Q°°(t,y)G J^As 
In particular, 



F(u)du — As 



t-T<_i— uij+j/ 



t-Ti-x-Wi-di+y 



F{u)du ± VsC'^(y) 



Q°°(Ti,y) G As 



F(u)du — As 



Ti-Ti-\-Wi+y 



F(u)du ± ^sC't{y) 



Ti—Ti- 1 -w i ~di+y 



As / F(u)du — As 



F(u)du — As 



Ti-Ti-i-Wi+y 



F{u)du ± VsC'C(y) 



T i -T i ^ 1 -Wi-d i +y 



(50) 



Now note that 



As 



/•oo 

/ F(u)du + As 

Jfi-Ti-i+y 



Ti-Ti_i-iUi+j/ 



Ti—Ti-\—Wi — di+y 



F(u)du < 2As / F(u)du 

«(»)+!/ 



and we claim that it is further bounded from above by \/sC^(y) for arbitrary constant C when s 
is large enough, uniformly over y G [0,oo). In fact, we have v(s) > inf{y : s F(u) < a} for 
any a > when s is large enough. Now when \fsCt;(y) < a/(2A), s J v ^ +y F(u)du < s F(u)du 
which is smaller and decays faster than y/sC^{y) when s is large. When ^fsC£,{y) > a/(2A), we 






have s J™ s)+y F(u)du < s C\ F{u)du < a/(2A). Picking C* = C + C where C* is defined in glj, 

we conclude that ^ implies is reached at Tj. 

Now let iV = inf{« : occurs }. Consider (suppressing the initial conditions), for any p > 0, 







" N 




£< 




.i=l 




oo 

£< 

.i=i 



/ oo 

+ + «)M)V(P9)(p(JV > i))V(pr) 



< 



(51) 



vi=l 



where $q, r > 0$ and $1/q + 1/r = 1$, by using Minkowski's inequality and Hölder's inequality in the first and second inequality respectively.
For $i = 2, 3, \ldots$, we have



£(«;, + dj + z) pq < [(Ew 



M\l/(pq) 



(52) 



by Minkowski's inequality again. 

We now analyze $E(w_i + d_i + z)^p$ for any $p > 0$. From now on $C$ denotes a constant, not necessarily the same every time it appears. First note that

$$(Ed_i^p)^{1/p} \le d^{(p)} := \sup_b \big(E[d_i^p \mid B(T_{i-1}+w_i) = b]\big)^{1/p} = \frac{1}{s}\sup_b \big(E[(U^0 - b)^p \mid B^0(0) = b]\big)^{1/p} = O\Big(\frac{1}{s}\Big) \tag{53}$$

and $z \le v(s) + \Delta + t_0 = o(s^\epsilon)$ for any $\epsilon > 0$. The last equality of (53) comes from the light-tail assumption on $U^0$. Indeed, since $U^0$ is light-tailed, we have

$$\exp\Big\{-\int_0^x h_U(u)\,du\Big\} = \bar F_U(x) \le e^{-cx}$$

for some $c > 0$, where $h_U(\cdot)$ and $\bar F_U(x)$ are the hazard rate function and tail distribution function of $U^0$ respectively. This implies that $h_U(x) \ge c$ for all $x \ge 0$. Then

$$\sup_{b \ge 0} P(U^0 - b > x \mid U^0 > b) = \sup_{b \ge 0} \exp\Big\{-\int_b^{x+b} h_U(u)\,du\Big\} \le e^{-cx}$$

and so

$$\sup_{b \ge 0} E[(U^0 - b)^p \mid B^0(0) = b] = \sup_{b \ge 0}\, p\int_0^\infty x^{p-1}P(U^0 - b > x \mid U^0 > b)\,dx \le p\int_0^\infty x^{p-1}e^{-cx}\,dx < \infty.$$

For $i = 1$, $w_1 \le l(s) + t_0 = o(s^\epsilon)$ where $l(s)$ is defined in (14). Hence $E(w_1 + d_1 + z)^p \le [(Ew_1^p)^{1/p} + (Ed_1^p)^{1/p} + z]^p = o(s^\epsilon)$ for any $\epsilon > 0$.
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Now 

Ew\ < E 
= E 



max Vi 

i=l ) ...,JV a (r i _ 1 )-JV,(T i _ 2 ) 



E 



max Vi 
i=l,...,JV 8 (T i _i)-JV i (T i _ 2 ) 



i-l 



i-2 



< CE^NsfTi-!) - N s (T t _ 2 )) p } for some constant C = C(p) and Z p (.) denned in (grj) 

< CE[(JV,(Ti_i) - iY s (T;_ 2 )) e ] for constant C = C(p, e) (54) 

for any $\epsilon > 0$, by Lemma 11. Pick $\epsilon < 1$. By Jensen's inequality and the elementary renewal theorem, (54) is less than or equal to

CiElN^T^) - N s {T^ 2 )]f 
= C(E[N s (T t ^) - iV a (Ti_ 2 )|Ti_i - T^ 2 \y 
< C{E[\a(Ti-i - T;_ 2 )]) e for some A > A 
= C\ e s e (E[Ti^ - Ti- 2 ]) e 
= CAV(S[^_i+d i _i + ^) e 



Let m = E[wi + di + z}. We then have 

Vi = Cs e y e i-i + d {1) + z 
By construction > to> and since v(s) = o(s e ) for any e > we have 

d m +z< Cs% < Cs e yl 
for large enough s, uniformly over i. Hence 

Vi < CsYi-x + d W + z < Cs e yf-i 

Now we can write 

< Cs'yU < af(c» e yUY = c 1+e a e+ *yf 



i-2 



...<(C 1/(1 ~ £) VlK /(1 - £) yf 

for any p > by choosing e, uniformly over i. 
Therefore from (52), (55) and (56), we get 



o(s p ) 



E{wi + di + z) pq = o(s e 



for any e > uniformly over i. 
Now consider 



P(N>1) = P((1) = l-P(( 1 ) 



< l-P[d^< 



(55) 



(56) 



(57) 



where C 2 is defined in Lemma [9] and c is defined in the discussion of Q 

< i _ be -a(m+z) 

= l-6e-° (s£) (58) 
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for some constants a > and < b < 1 and any e > 0. Moreover, for i = 2, 3, . . ., 

P(N >i) = P(N > i - l)P(&.i\N >i-l) 

< P(N > i - 1)E[1 - be~ a ^- 1+z) \N >i-l\ 

< P{N > i - 1)(1 - be-a(E[wi-i\N>i-l]+z)^ (59) 

by Jensen's inequality and that the function 1 — be~ a ^' +z ^ is concave. 
Consider E[wi\N > i] for any i = 2, 3, We have 

E[wi\N >i] = E[E[ Wi \Q-i, Wi-i + di-i + z]\N > i] (60) 
Now by singling out failure in the first trial of to (see the discussion on £J, we get 

P(C-_iki-i +<k-i + z) > c 3 
where C3 is defined in Lemma [9j uniformly over tUi_i + + z. Hence 

C 3 ^[iWi|Ci_i,w;i-i + + 2] < J P(Q_ 1 \w i -i + di-i + z)E[w i \Ci_ 1 ,Wi- 1 + dj-i + z]P(u>i_i + + z£ dx) 



which gives 



uniformly over it>j_i + + z. Therefore ([60]) is bounded from above by Ewi/C; 



From (55) and (56) we know that Ewi = o(s e ) for any e > 0. So ( |59| ) is less than or equal to 
P(N > i - 1)(1 - 6e-«(J5«>*-i/C!>+*)) = p(jv > i - l)(l - fre" ^) (61) 
for any e > uniformly over i. 



By (51), (58), (57) and (61) we get 



Et p < o(s e ) jT(P(iV > i)) 1 



/(pr) 



vt=l 

' 00 



< o(s e ) - 6e-° (s£) ) i/(pr) J 

1 

- °^ ^[l_(l_6 e -o( S £ ))l/(pr)]p 

< o(s e )e o(s£) 



Hence 



1 , ^ r, e o(s e 
-logPr p <- + 

s s s 



as s — > 00. On the other hand, we pick A such that t^4 > A and so 



1 logPr^ > 1 logA p 
s s 
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Conclusion follows for 

For @, note that N A < N s (t a ) < N s (t' a ) and EN s (t)P = O(st) since (1/s) logEe eN °W 
—ip N (6)t. Hence 

EN s {r' A f < 0(sV)E{t' a Y 
and the result follows from ([3]). □ 

Remark 6. The proof of Proposition 1 can be simplified when the service time has bounded support, say on $[0, M]$. In this case the GI/G/∞ system is "$M + U_0$-independent", i.e. $W_t$, the state of the system at time $t$, and $W_{A_{N_s(t)+1}+M}$, the state of the system $M$ time units after the first arrival since time $t$, are independent. As a result we can merely set $v(s) = M$ and $x_i = M$ for any $i$, and the same argument as above will apply.



5 Numerical Example 

We close this paper with a numerical example for GI/G/s. We set the interarrival times in the base system to be Gamma(1/2, 1/2) so $\lambda = 1$. For illustrative convenience we set the service times as Uniform(0, 1). Hence the traffic intensity is 1/2. In this case, we can simply set $C_* = 1$ and

$$\xi(y) = \mathrm{sd}(R(\infty,y)) \vee C_1 = \sqrt{\lambda\int_y^\infty F(u)\bar F(u)\,du + \lambda c_a^2\int_y^\infty \bar F(u)^2\,du} \vee C_1$$

with $C_1 = 1.1$ (note that $\eta = 0$ and we use a truncated $\xi(y)$; the validity of this simpler choice than the one displayed in Section 2.1 can be verified from the arguments in Section 4 specialized to the case of bounded service time). Also we choose $\Delta = 1$. To test the numerical efficiency of our importance sampling algorithm, we compare it with a crude Monte Carlo scheme using increasing values of $s$, namely $s = 10, 30, 60, 80, 100$ and $120$.
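For orientation, a crude Monte Carlo estimator for this setup can be sketched as follows. This is not the authors' implementation. It assumes that Gamma(1/2, 1/2) means shape 1/2 and rate 1/2 (so the base interarrival time has mean 1, consistent with $\lambda = 1$), that the $s$-th system speeds arrivals up by a factor $s$, and that the loss probability is estimated by the long-run fraction of lost arrivals from an empty start. The function name `crude_loss_fraction` is hypothetical.

```python
import heapq
import random


def crude_loss_fraction(s, n_arrivals, seed=0):
    """Fraction of arrivals lost in a GI/G/s loss system with s servers."""
    rng = random.Random(seed)
    busy = []  # min-heap of departure times of customers in service
    t = 0.0
    lost = 0
    for _ in range(n_arrivals):
        # gammavariate(shape, scale): shape 1/2, scale 2 gives mean 1.
        t += rng.gammavariate(0.5, 2.0) / s
        # Free the servers whose customers have departed by time t.
        while busy and busy[0] <= t:
            heapq.heappop(busy)
        if len(busy) < s:
            heapq.heappush(busy, t + rng.random())  # Uniform(0, 1) service
        else:
            lost += 1  # all servers busy: the arrival is lost
    return lost / n_arrivals


for s in (10, 30):
    print(s, crude_loss_fraction(s, n_arrivals=200000))
```

Such an estimator needs on the order of the reciprocal of the loss probability in arrivals before it sees a single loss, which is why it stops producing usable estimates for large $s$.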

As discussed in Section 2, since we run our importance sampler every time we hit the set $A$, the initial positions of the importance samplers are dependent. To get an unbiased estimate of the standard error we group the samples into batches and obtain statistics based on these batch samples (see Asmussen and Glynn (2007)). To make the estimates and statistics comparable, for each experiment we run the computer for roughly 120 seconds of CPU time and always use 20 batches. In the tables below, we report the estimates of the loss probability, the relative errors (ratios of sample standard deviation to sample mean) and 95% confidence intervals for both the crude Monte Carlo scheme and the importance sampler under different values of $s$.
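The batch statistics described above can be computed along the following lines. This is a sketch under assumptions: the relative error is taken as the sample standard deviation of the 20 batch estimates divided by their mean, and the 95% interval uses the Student-t quantile with 19 degrees of freedom (about 2.093). These conventions are not stated in the text; they are chosen because they appear consistent with the reported numbers. For instance, a relative error of $4.472 \approx \sqrt{20}$ is what one gets when exactly one of 20 batches is non-zero.

```python
import math

T_975_DF19 = 2.093  # 0.975 quantile of Student's t, 19 degrees of freedom


def batch_summary(batch_estimates, t_quantile=T_975_DF19):
    """Return (mean, relative error, confidence interval) from batch means."""
    n = len(batch_estimates)
    mean = sum(batch_estimates) / n
    var = sum((b - mean) ** 2 for b in batch_estimates) / (n - 1)
    sd = math.sqrt(var)  # sample standard deviation across batches
    half_width = t_quantile * sd / math.sqrt(n)
    rel_err = sd / mean if mean != 0 else float("nan")
    return mean, rel_err, (mean - half_width, mean + half_width)


# Hypothetical batches: a single non-zero batch out of 20.
batches = [20 * 6.9444e-7] + [0.0] * 19
mean, rel_err, ci = batch_summary(batches)
print(mean, rel_err, ci)
```

With these hypothetical batches the summary gives a relative error of $\sqrt{20}$ and an interval with a negative lower end, the same pattern as the crude Monte Carlo row for $s = 80$.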

When $s$ is small we see that crude Monte Carlo performs slightly better than our importance sampler. However, when $s$ is over 80, the importance sampler starts to perform better. When $s$ is above 100, crude Monte Carlo breaks down completely while our importance sampler still gives estimates with encouragingly small relative error.



Crude Monte Carlo

s     Estimate          R.E.     C.I.
10    0.05318           0.0265   (0.05252, 0.05384)
30    0.003174          0.111    (0.003009, 0.003338)
60    7.0922 x 10^-5    1.388    (2.4847 x 10^-5, 1.1700 x 10^-4)
80    6.9444 x 10^-7    4.472    (-7.5904 x 10^-7, 2.1479 x 10^-6)
100                     N/A      N/A
120                     N/A      N/A

Importance Sampler

s     Estimate          R.E.     C.I.
10    0.05412           0.130    (0.05084, 0.05740)
30    0.003204          0.570    (0.002349, 0.004060)
60    6.2585 x 10^-5    2.258    (-3.5529 x 10^-6, 1.2872 x 10^-4)
80    4.5001 x 10^-8    1.879    (5.4365 x 10^-9, 8.4565 x 10^-8)
100   8.1178 x 10^-10   2.296    (-6.0511 x 10^-11, 1.6841 x 10^-9)
120   1.3025 x 10^-10   4.472    (-1.4237 x 10^-10, 4.0286 x 10^-10)




We can also analyze the graphical depiction of the sample paths. Figures 6 and 7 are two sample paths run by Algorithm 2, initialized at the mean of $Q(t,y)$, i.e. $\lambda s\int_y^\infty \bar F(u)\,du$. Figure 6 is a contour plot of $Q(t,y)$, whereas Figure 7 is a three-dimensional plot of another $Q(t,y)$. As we can see, the number of customers (the color at the $t$-axis) increases from time 0 to around 0.95, when it hits overflow in the contour plot. A similar trajectory appears in the three-dimensional plot. These plots are potentially useful for an operations manager to judge the possibility of overflow over a finite horizon given the current state.



Figure 6: Contour plot of Q(t,y) 



Figure 7: Three-dimensional plot of Q(t,y) 
A Technical Proofs
A.1 Proof of Lemma 1

The domain of $\psi_t(\cdot)$ is easily seen to be inherited from $\psi_N(\cdot)$. Write

$$\psi_t(\theta) = \int_0^t \psi_N\big(\log(e^\theta\bar F(u) + F(u))\big)\,du$$

Note that

$$\frac{\partial}{\partial\theta}\psi_N\big(\log(e^\theta\bar F(u) + F(u))\big) = \psi'_N\big(\log(e^\theta\bar F(u) + F(u))\big)\frac{e^\theta\bar F(u)}{e^\theta\bar F(u) + F(u)}$$

is continuous in $u$ and $\theta$. Hence

$$\psi'_t(\theta) = \int_0^t \psi'_N\big(\log(e^\theta\bar F(u) + F(u))\big)\frac{e^\theta\bar F(u)}{e^\theta\bar F(u) + F(u)}\,du$$

(see Rudin (1976), p. 236, Theorem 9.42). Moreover, $\psi'_N(\log(e^\theta\bar F(u) + F(u)))\,e^\theta\bar F(u)/(e^\theta\bar F(u) + F(u))$ is uniformly continuous in $u$ and in a neighborhood of $\theta$, for any $\theta \in \mathbb{R}$. Hence $\psi'_t(\theta)$ is continuous in $\theta$. Also the strict monotonicity of $\psi'_N(\cdot)$ implies that $\psi'_t(\theta)$ too is strictly increasing for any $\theta > 0$. Following the same argument, we have

$$\psi''_t(\theta) = \int_0^t \Big[\psi''_N\big(\log(e^\theta\bar F(u) + F(u))\big)\Big(\frac{e^\theta\bar F(u)}{e^\theta\bar F(u) + F(u)}\Big)^2 + \psi'_N\big(\log(e^\theta\bar F(u) + F(u))\big)\frac{e^\theta\bar F(u)F(u)}{(e^\theta\bar F(u) + F(u))^2}\Big]\,du$$

which is continuous in $\theta$.

Finally, note that as $\theta \nearrow \infty$, $\psi'_N(\log(e^\theta\bar F(u) + F(u)))\,e^\theta\bar F(u)/(e^\theta\bar F(u) + F(u)) \nearrow \infty$ for any $u \in \mathrm{supp}\,\bar F$ since $\psi_N(\cdot)$ is steep. By the monotone convergence theorem we conclude that $\psi_t(\cdot)$ is steep.
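The function $\psi_t(\theta) = \int_0^t \psi_N(\log(e^\theta\bar F(u) + F(u)))\,du$ is easy to evaluate numerically. The sketch below does so for the distributions of the numerical example, under one assumption that is not stated in the text: for Gamma interarrivals with shape 1/2 and rate 1/2, the limiting log-moment generating function of the arrival process is taken to be $\psi_N(\theta) = (e^{2\theta} - 1)/2$, obtained from the standard renewal-process relation $E e^{-\psi_N(\theta)U^0} = e^{-\theta}$. With Uniform(0, 1) service this gives the closed form $\psi_1(\theta) = (e^{2\theta} + e^\theta - 2)/6$, which the code uses as a check.

```python
import math


def psi_N(theta):
    """Assumed log-mgf rate of the arrival process (Gamma(1/2, rate 1/2))."""
    return (math.exp(2.0 * theta) - 1.0) / 2.0


def F(u):
    """Uniform(0, 1) service-time distribution function."""
    return min(max(u, 0.0), 1.0)


def psi_t(theta, t, n=2000):
    """Midpoint rule for int_0^t psi_N(log(e^theta * Fbar(u) + F(u))) du."""
    h = t / n
    total = 0.0
    for k in range(n):
        u = (k + 0.5) * h
        total += psi_N(math.log(math.exp(theta) * (1.0 - F(u)) + F(u)))
    return total * h


theta = 0.7
a = math.exp(theta)
print(psi_t(theta, 1.0), (a * a + a - 2.0) / 6.0)  # numerical vs closed form
print(psi_t(theta, 2.0))  # same value: the integrand vanishes for u >= 1
```

The second print illustrates why bounded service times simplify matters: $\psi_t$ stops changing once $t$ exceeds the upper end of the support.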

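A.2 Proof of Lemma 2

1) Denote $\theta(t) = \theta_t$ for convenience. Since $\psi'_t(\cdot)$ is continuously differentiable by Lemma 1, by the implicit function theorem we can differentiate $\psi'_t(\theta(t)) = a_t$ with respect to $t$ on both sides to get

$$\psi'_N\big(\log(e^{\theta(t)}\bar F(t) + F(t))\big)\frac{e^{\theta(t)}\bar F(t)}{e^{\theta(t)}\bar F(t) + F(t)} + \int_0^t \Big[\psi''_N\big(\log(e^{\theta(t)}\bar F(u) + F(u))\big)\Big(\frac{e^{\theta(t)}\bar F(u)}{e^{\theta(t)}\bar F(u) + F(u)}\Big)^2 + \psi'_N\big(\log(e^{\theta(t)}\bar F(u) + F(u))\big)\frac{e^{\theta(t)}\bar F(u)F(u)}{(e^{\theta(t)}\bar F(u) + F(u))^2}\Big]\,du\ \theta'(t) = \lambda\bar F(t)$$

which gives

$$\theta'(t) = \frac{\lambda\bar F(t) - \psi'_N\big(\log(e^{\theta(t)}\bar F(t) + F(t))\big)\,e^{\theta(t)}\bar F(t)/(e^{\theta(t)}\bar F(t) + F(t))}{\int_0^t \Big[\psi''_N\big(\log(e^{\theta(t)}\bar F(u) + F(u))\big)\Big(\frac{e^{\theta(t)}\bar F(u)}{e^{\theta(t)}\bar F(u) + F(u)}\Big)^2 + \psi'_N\big(\log(e^{\theta(t)}\bar F(u) + F(u))\big)\frac{e^{\theta(t)}\bar F(u)F(u)}{(e^{\theta(t)}\bar F(u) + F(u))^2}\Big]\,du} \le 0$$

The inequality is due to the fact that

$$g_t(\theta) := \psi'_N\big(\log(e^\theta\bar F(t) + F(t))\big)\frac{e^\theta\bar F(t)}{e^\theta\bar F(t) + F(t)} \tag{62}$$

is non-decreasing in $\theta$ and $g_t(0) = \lambda\bar F(t)$, and that $\psi_N(\cdot)$ is non-decreasing and convex. Hence $\theta(t)$ is non-increasing.

2) Since $a_t \ge 1 - \lambda EV$, $\theta_t \ge \tilde\theta_t$ where $\tilde\theta_t$ satisfies $\psi'_t(\tilde\theta_t) = 1 - \lambda EV$, well-defined when $t$ is small enough. Moreover, it is easy to check that $\psi'_t(\theta) \le \psi'_N(\theta)t$ for any $\theta, t \ge 0$ (either by the formula of $\psi'_t$ and $\psi'_N$ or by definition in terms of the Gärtner-Ellis limit). This implies that $(\psi'_t)^{-1}(y) \ge (\psi'_N)^{-1}(y/t)$ for any $y$ in the domain. Putting $y = 1 - \lambda EV$ gives $\tilde\theta_t \ge (\psi'_N)^{-1}((1 - \lambda EV)/t)$. By steepness of $\psi_N$ we have $(\psi'_N)^{-1}((1 - \lambda EV)/t) \nearrow \infty$ as $t \searrow 0$. So $\theta_t \nearrow \infty$ as $t \searrow 0$.

3) Consider $\psi'_t(\theta_t) = a_t$, or $\theta_t = (\psi'_t)^{-1}(a_t)$. Now from (18) we have

$$\psi'_\infty(\theta) = \int_0^\infty \psi'_N\big(\log(e^\theta\bar F(u) + F(u))\big)\frac{e^\theta\bar F(u)}{e^\theta\bar F(u) + F(u)}\,du$$

and that $\psi'_\infty(\theta)$ is increasing in $\theta$, by the same argument as in the proof of 1). Moreover, by monotone convergence we have $\psi'_t \nearrow \psi'_\infty$ as $t \nearrow \infty$.

By Billingsley (1979), p. 287, or Resnick (2008), p. 5, Proposition 0.1, we have $(\psi'_t)^{-1} \to (\psi'_\infty)^{-1}$ as $t \nearrow \infty$. Moreover, since $(\psi'_t)^{-1}$ is increasing over the compact interval $[\lambda EV, 1]$, the convergence is uniform. By Resnick (2008), p. 2, this implies continuous convergence, and hence $(\psi'_t)^{-1}(a_t) \to (\psi'_\infty)^{-1}(1)$, or $\theta_t \to \theta_\infty$.
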
A.3 Proof of Lemma 3

1) As in the proof of Lemma 2 Part 1, denote $\theta(t) = \theta_t$. Consider

$$\frac{d}{dt}I_t = \theta(t)\lambda\bar F(t) + \theta'(t)a_t - \psi'_t(\theta(t))\theta'(t) - \psi_N\big(\log(e^{\theta(t)}\bar F(t) + F(t))\big)$$
$$= \theta(t)\lambda\bar F(t) - \psi_N\big(\log(e^{\theta(t)}\bar F(t) + F(t))\big)$$

since $\psi'_t(\theta(t)) = a_t$. Note that $h_t(\theta) := \psi_N(\log(e^\theta\bar F(t) + F(t)))$ is convex in $\theta$ for any $t \ge 0$ and so

$$h_t(\theta(t)) \ge h_t(0) + h'_t(0)\theta(t)$$

which gives

$$\psi_N\big(\log(e^{\theta(t)}\bar F(t) + F(t))\big) \ge \lambda\bar F(t)\theta(t)$$

Hence $(d/dt)I_t \le 0$ and so $I_t$ is non-increasing.

2) Write $I_t = a_t\theta_t - \psi_t(\theta_t)$. By Lemma 2 Part 3, $\theta_t \searrow \theta_\infty$ on $[\theta_\infty, \theta_T]$ for $t \ge T$ for some $T > 0$. Since $\psi_t(\theta)$ is increasing in $\theta$, by continuous convergence (see Resnick (2008), p. 2) we have $\psi_t(\theta_t) \to \psi_\infty(\theta_\infty)$. Hence $I_t \to I^*$.

3) Note that in case $V$ is supported on $[0, M]$, it is easy to check that $I_t = I_M$ is the same for any $t \ge M$. Hence the conclusion.

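A.4 Proof of Lemma 4

1) Following the spirit of the proof of Lemma 3 Part 1, denote $\theta(t) = \theta_t$ for convenience and consider

$$\frac{d}{dt}I_t = \theta'(t)(1 - \lambda EV) - \psi'_N(\theta(t))\,t\,\theta'(t) - \psi_N(\theta(t)) = -\psi_N(\theta(t)) \le 0$$

for small $t$, using $\psi'_N(\theta_t)t = 1 - \lambda EV$. Hence the conclusion.

2) Consider $\theta_t = (\psi'_N)^{-1}((1 - \lambda EV)/t)$, well-defined by the strict monotonicity of $\psi'_N$. By steepness of $\psi_N$ we have $(\psi'_N)^{-1}((1 - \lambda EV)/t) \nearrow \infty$ as $t \searrow 0$. So $\theta_t \nearrow \infty$ as $t \searrow 0$.
Now write

$$I_t = \theta_t(1 - \lambda EV) - \psi_N(\theta_t)t = (1 - \lambda EV)\Big[\theta_t - \frac{\psi_N(\theta_t)}{\psi'_N(\theta_t)}\Big] \to \infty$$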

where the convergence follows from ^ and 1). 
A.5 Proof of Lemma 5

To prove Lemma 5 we first need the following analytical lemma:

Lemma 12. Let $h_m : \mathcal{D} \subset \mathbb{R}^n \to \mathbb{R}$ be a sequence of monotone functions, in the sense that $h_m(x_1, x_2, \ldots, x_{i-1}, y_i, x_{i+1}, \ldots, x_n)$ is either non-decreasing or non-increasing in $y_i$ fixing $x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_n$, for any $i = 1, \ldots, n$. Moreover, suppose $\mathcal{D}$ is compact. If $h_m \to h$ pointwise, where $h$ is continuous, then the convergence is uniform over $\mathcal{D}$.

Proof. Since $\mathcal{D}$ is compact, continuity of $h$ implies uniform continuity. Therefore, given $\epsilon > 0$, there exists $\delta > 0$ such that $\|x_1 - x_2\| < \delta$ implies $|h(x_1) - h(x_2)| < \epsilon$. Compactness of $\mathcal{D}$ implies that there is a finite collection of these $\delta$-balls covering $\mathcal{D}$. Let $\{N_\delta(x)\}_{x \in \mathcal{E}}$ be such a collection. Note that $h_m \to h$ uniformly over the finite set $\mathcal{E}$.

For any $x = (x_1, \ldots, x_n) \in \mathcal{D}$, consider

$$|h_m(x) - h(x)| \le |h_m(x) - h_m(\bar x)| + |h_m(\bar x) - h(\bar x)| + |h(\bar x) - h(x)|$$

where $\bar x = (\bar x_1, \ldots, \bar x_n)$ is chosen to be the closest point to $x$ in $\mathcal{E}$ that satisfies: for $i = 1, \ldots, n$, $\bar x_i \ge x_i$ if $h$ is non-decreasing in the $i$-th component, and $\bar x_i \le x_i$ if $h$ is non-increasing in the $i$-th component.

By construction we have $|h(\bar x) - h(x)| < 2\epsilon$ and $|h_m(\bar x) - h(\bar x)| < \epsilon$ when $m$ is large enough. Now

$$|h_m(x) - h_m(\bar x)| = h_m(\bar x) - h_m(x)$$

by our choice of $\bar x$ and the monotone property of $h_m$,

$$\le h_m(\bar x) - h_m(\underline x)$$

where $\underline x$ is chosen to be the closest point to $x$ in $\mathcal{E}$ that satisfies: for $i = 1, \ldots, n$, $\underline x_i \le x_i$ if $h$ is non-decreasing in the $i$-th component, and $\underline x_i \ge x_i$ if $h$ is non-increasing in the $i$-th component,

$$\le |h_m(\bar x) - h(\bar x)| + |h(\bar x) - h(\underline x)| + |h_m(\underline x) - h(\underline x)|$$
$$\le \epsilon + 2\epsilon + \epsilon$$

when $m$ is large enough.

Combining the above, we have $|h_m(x) - h(x)| \le 7\epsilon$ for all $x \in \mathcal{D}$. Hence the conclusion.

□

Proof of Lemma^ For convenience write ip s (0; w, z, t) =\ogEe Q ^ co] and 

ip(9;w,z,t) = I ip N (log(e 9 F(t-u) + F(t-u)))du 



defined for 9 € [0oo,0t], t >T and 0<w<z<t + r] for some r/ > 0. We can extend the domain 
by putting tp s (9; w, z, t) = ip s (6; w,t + 77, t) and ip(6; w, z, t) = ip(9; w, t + 77, t) for z > t + 77, and 
■0 S (#; z, t) = il)(9; w, z, t) = for w > z. 

Note that ip s (9; w, z, t) defined as such is non-decreasing in 9, non-increasing in w, non-decreasing 
in z and non-increasing in t. Also, ip s (9;w,z,t) — > ip(9;w,z,t) pointwise with il>(9;w,z,t) con- 
tinuous. Hence the convergence is uniform over the compact set 9 S [9oo,9t] and (w,z,t) £ 



[0, K + 77] x [0, i'T + 77] x [0, K] by Lemma 12 , for any > 0. By our construction we can extend 
the set of uniform convergence to (w,z,t) 6 [0,oo) 2 x [0, if]. 

We now choose K as follows. Given e > 0, there exists K > such that for all t > K, z < t — K, 
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w 
t—w 



t-z 

oo 



we have 

ij>(0;w,z,t)= I ^ N (log(e e F(t-u) + F(t-u)))du 
iP N (log{e e F(u) + F(u)))du 

< I ^ N (log(e e F(u) + F(u)))du 
Jk 

/•oo 

< d\ / log(l + (e e - l)F(u))du 

Jk 

/•OO 

< C 2 X / F(u)du 

Jk 

< e 

for some C±, C% > 0, uniformly over 9 £ [6oo, &t]- Hence for z < t — K, ifi s (8; w, z, t) < ip s (9; 0, t — 
K, t) — > ip(9; 0, t — K,t) < e uniformly over 9 G [Boo, &t] an d so \ift s (6; w, z, t) — ifi(8) w, z,t)\ < 3e for 
large enough s. 

For z > t — K, we write 

i rn , \ 1 i T-i 9Q°°. „\t,oo]I(w<t-K)+6Q?? „... [too] 

s 

which is bounded from above by 

1 log (Ee eQ ™^K [tMI{w<t ~ K) E Q e Q ^-*+^-^ [KM ^j 

= ib s (9;w,t- K,t)I(w <t-K) + -l O gE e eQ ^- t + K )^-^ [K,oo] 

s 

and bounded from below by 

1 log (Ee eQ ^-K [t '™ ]I{w<t - K) E 00 e 6Q ^-t+K)^-™) [KM ^ 

= iJ s (9-w,t-K,t)I(w<t-K) + -logE 00 e eQ ^- t + K ^(--^ [K ' oo] (63) 

s 

where Eq[-] denotes the expectation conditioned that a customer arrives at time and is counted 
in Qo*( z _t+K)A(z-w)\-t' °°]> w hile Eqq[-] denotes the expectation conditioned on delayed arrival with 
tail distribution (in the basic scale) given by sup fe P(U° — b > x\U° — b). Note that sup 6 P(U° — b> 
x\U° > b) is a valid tail distribution because of the light-tail assumption on U°. Indeed, it is obvious 
that sup 6 P(U° — b > 0\U > b) = 1, and by the same argument following that of (53), we have 



sup b P(U° — b>x\U°>b)<e cx — > for some c > 0. Moreover, it is obvious that sup b P(U° — b> 
x\U° > b) is non-increasing. Now by construction this tail distribution is stochastically at most as 

large as P(U°-b > x\U° > b) for any b > 0, and hence @. Note that \ \ogEoe^^- t + K )^- w ) [K ' co] 

1 6O 00 \K ool 

and - s logE'ooe v °.( z -*+^) A < z -») L ' 1 both converge to ip(9; 0, (z — t + K) A (z — w), K) uniformly by 
the argument earlier (as a special case when t < K). Also we have shown that ip s {9;w,t — K,t) 
converges to ip s (9; w,t — K, t) uniformly for t > K (as a special case when z < t — K and t > K). 
The sandwich argument concludes the lemma. □ 
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A. 6 Proof of Lemma |6] 

Consider 



- log E exp J (J2 °K E /((j - 1) A < ^ + ^ < j A) 

lk=l\j=k i=AT s ((fc-l)A)+l 
AT S (A;A) \ N 

+ 0*. ^ J(Fi + At > nA) U 

i=AU(fc-l)A)+l / J 



,((fc-l)A)+l 
n iV s (fcA) 



JlogEj] ^e^P((i-l)A<y i + A i <iA) + e efc F(nA-^ 

fe=li=JV s ((fe-l)A)+l Vi=fe 



-logSexpi V / 



n rkA 

hk(u)dN s (u) 

(fc-l)A 



where 



Now 



h k (u) = log I ^ e e *"P((j - 1)A < Vj + it < jA) + e % F(nA 



^ C n m 

-logSexp < ^ ^ /i fe (C 



fc=l 10 = 1 



iVJ(^-l)A + ^-^^-l)A+ ( "- 1)A 



m J \ m 



< -logSexp J V / 

< - logS exp \j2J2 hk tt 



h k (u)dN s {u) 

(fc-l)A 



kw) 

k=l w=l 



m J \ m 



where C kw = argmin{/i/%(u) : (k — 1)A + (w — l)A/m < u < {k — 1)A + wA/m} and ( kw = 
argmax{h k (u) : (k — 1)A + (w — l)A/m < u < (k — 1)A + wA/m}. The existence of C kw and £ kw 
is guaranteed by the continuity of h k (-), which is implied by our assumption that V{ has density. 
Letting s — > oo and by ([7]) we have 



n m ^ ( n 

J2J2^(h h (C k J)- < lim inf- logS exp ^ / 

< lim sup - log E exp < \ . \ 

s^oo S { k=l J(i 



h k (u)dN s (u) 

(fc-i)A 



h k (u)dN s (u) 

(fc-i)A 



n m ^ 
A;=l MJ=1 

By continuity of /ifc (•) and (")> ^iv (%(')) * s Riemann integrable. Letting m — )• oo yields the 
conclusion. 






A.7 Proof of Lemmas 9 and 10

Our goal here is to prove Lemma 9 via Lemma 10.



For convenience let G{y) 



F(u)du 



where r\ is defined in (13). Note that by L'Hospital's rule and Assumption ([8]), we have 

vF{y) F(y) - yf(y) 

lim ^nT\ = lim WT~\ = lim KVKV) - 1) = oo 

2/->oo Lx{y) y^oo —r\y) y^oo 



(64) 



As discussed before, the key step to show Lemma 9 is an estimate of the limiting Gaussian process given by Lemma 10. The proof of this inequality takes three steps. We first consider the case when $i = 1$. The first step is to define a d-metric (in fact a pseudo-metric)

$$d_1((t,y),(t',y')) = E\big(\tilde R_1(t,y) - \tilde R_1(t',y')\big)^2 \tag{65}$$

where $\tilde R_1(t,y) = R_1(t,y)/\nu(y)$, and show that the domain is compact under this (pseudo) metric. Then we can prove that the Gaussian process $\tilde R_1(t,y)$ is a.s. bounded by an entropy argument. The third step is an invocation of Borell's inequality.

For convenience let $S = [0,t_0] \times [0,\infty)$.

Before these steps, we need an estimate of the d-metric:

Lemma 13. Let (t, y) and (t', y') be two points on [0, to] x [0, oo). Without loss of generality assume 
t + y<t' + y'. Then 

A / t2 (F(t + y-u)- F(t' + y' - n))(l + Fjt + y - u) - F(t' + y' - u))du 

u(y) 2 



+x L 2p{t,+y '- u)Fit,+y '- u)du {w)-^) 

X J/ 1 F(h + yi - u)F(h +y 1 - u)du 



where t\ =t\/t' and y\ is the corresponding y or y' . 



(66) 



The proof of this lemma follows the approach in Lemma 5.1 of Krichagina and Puhalskii (1999). 
Hence we only sketch the proof here: 



Proof. (Sketch) Recall that

$$R_1(t,y) = \int_0^t\int_0^\infty I(u+x>t+y)\,dK(u,x).$$

For a partition $\{u_0 = 0, u_1, u_2, \ldots, u_k\}$ of $[0,t_0]$, define

$$I_{k,t+y}(u,x) = \sum_{i=1}^k I\big(u\in(u_{i-1},u_i]\big)\,I(x>t+y-u_i).$$

Let

$$R_1^k(t,y) = \int_0^t\int_0^\infty I_{k,t+y}(u,x)\,dK(u,x)$$

be a discretized version of $R_1(t,y)$. One can check that $R_1^k(t,y)$ converges to $R_1(t,y)$ in mean square as the mesh of the partition goes to 0.

Now take $(t,y)$ and $(t',y')$ in $S$ such that $t+y<t'+y'$. Define $t_1 = t\vee t'$ and let $y_1$ be the corresponding $y$ or $y'$, and define $t_2 = t\wedge t'$ and let $y_2$ be the corresponding $y$ or $y'$. Also define $k$ such that $u_k\le t_1$ while $u_{k+1}>t_1$. Using (5.4) and (5.5) in Krichagina and Puhalskii (1999), we have

$$\begin{aligned}
&E\left(\frac{R_1^k(t,y)}{\nu(y)}-\frac{R_1^k(t',y')}{\nu(y')}\right)^2\\
&= \sum_{i:\,u_i\le t_2}\frac{1}{\nu(y)^2}\,\lambda(u_i-u_{i-1})\big(F(t'+y'-u_i)-F(t+y-u_i)\big)\big(1+F(t+y-u_i)-F(t'+y'-u_i)\big)\\
&\quad+ \sum_{i:\,u_i\le t_2}\left(\frac{1}{\nu(y)}-\frac{1}{\nu(y')}\right)^2\lambda(u_i-u_{i-1})F(t'+y'-u_i)\bar F(t'+y'-u_i)\\
&\quad+ \sum_{i:\,t_2<u_i\le t_1}\frac{1}{\nu(y_1)^2}\,\lambda(u_i-u_{i-1})F(t_1+y_1-u_i)\bar F(t_1+y_1-u_i)
\end{aligned}$$

which converges to (66) as the mesh goes to 0. □



Lemma 14. We can compactify the space $[0,t_0]\times[0,\infty]$ with the d-metric defined in (65).

Proof. Consider the mapping $(i,\tan) : [0,t_0]\times[0,\pi/2]\to[0,t_0]\times[0,\infty]$, where $i$ is the identity map. Here the domain is equipped with the Euclidean metric while the image is equipped with the d-metric. We will show that the mapping $(i,\tan)$ is well-defined and continuous over its domain, including the points $(t,x)$ where $x=\pi/2$, and hence its image is compact.

Suppose first that $(t,x)\to(t^*,x^*)$ where $x^*\ne\pi/2$. Since $\tan(\cdot)$ is continuous, and $\int_y^{t+y}\bar F(u)\,du$ and $\nu(y)$ are continuous in $t$ and $y$ (under the Euclidean metric), it is easy to see that $d_1((t,\tan x),(t^*,\tan x^*))\to 0$ by using (66).

We now show that $d_1(\cdot,\cdot)$ is still a (pseudo) metric when including the points $(t,y)$ with $y=\infty$. Define, for $y'=\infty$, that

$$d_1((t,y),(t',y')) = \frac{\lambda\int_0^{t_2}\bar F(t+y-u)\big(1+F(t+y-u)\big)\,du}{\nu(y)^2} + \begin{cases}\dfrac{\lambda\int_{t_2}^{t_1}F(t+y-u)\bar F(t+y-u)\,du}{\nu(y)^2} & \text{if } t>t'\\[1.5ex] 0 & \text{if } t\le t'\end{cases}$$

and $d_1((t,y),(t',y')) = 0$ if $y=y'=\infty$. It is straightforward to check that $d_1(\cdot,\cdot)$ is continuous at $y'=\infty$ by using (66) (note that the second term of (66) goes to 0 since for $y'$ large enough it is less than or equal to $\lambda\int_{y'}^{t'+y'}\bar F(u)\,du/\nu(y')^2\le\lambda G(y')^{1-2/(2+\eta)}\to 0$). Hence both the commutativity and triangle inequality hold also at $y'=\infty$, which implies that $d_1(\cdot,\cdot)$ is a pseudo-metric on $[0,t_0]\times[0,\infty]$. Now consider $x^*=\pi/2$. It is now easy to see that $d_1((t,\tan x),(t^*,\infty))\to 0$ as $(t,x)\to(t^*,\pi/2)$. □

Lemma 15. $E\sup_S\tilde R_1(t,y)<\infty$. In particular, $\tilde R_1(t,y)$ is a.s. bounded over $S$.

Proof. We use $C$ here to denote constants, not necessarily the same every time it appears. We carry out an entropy argument (see for example Adler (1990)):

$$E\sup_S\tilde R_1(t,y)\le K\int_0^\infty H^{1/2}(\epsilon)\,d\epsilon = K\int_0^{\mathrm{diam}(S)/2}H^{1/2}(\epsilon)\,d\epsilon$$

where $K>0$ is a universal constant, $H(\epsilon)=\log N(\epsilon)$ with $N(\epsilon)$ the $\epsilon$-th order entropy of $S$, i.e. the minimum number of $\epsilon$-balls (under the $d_1$-metric) needed to cover $S$, and $\mathrm{diam}(S)$ is the diameter of $S$ given by $\sup_{(t,y),(t',y')\in S}d_1((t,y),(t',y'))$.

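As in Lemma 13, let $(t,y)$ and $(t',y')$ be two points on $[0,t_0]\times[0,\infty]$ such that $t+y<t'+y'$, and let $t_1=t\vee t'$ with $y_1$ the corresponding $y$ or $y'$. Note that from (66) we have

$$\begin{aligned}
d_1((t,y),(t',y')) \le{}& \frac{\lambda\int_0^{t_2}\big(\bar F(t+y-u)-\bar F(t'+y'-u)\big)\,du}{\nu(y)^2} + \lambda\int_0^{t_2}\bar F(t'+y'-u)\,du\left(\frac{1}{\nu(y)}-\frac{1}{\nu(y')}\right)^2\\
&+ \frac{\lambda\int_{t_2}^{t_1}\bar F(t_1+y_1-u)\,du}{\nu(y_1)^2}\\
\le{}& \frac{\lambda\int_y^{t+y}\bar F(u)\,du}{\nu(y)^2} + \lambda\left(\int_y^{t_0+y}\bar F(u)\,du\wedge\int_{y'}^{t_0+y'}\bar F(u)\,du\right)\left(\frac{1}{\nu(y)}\vee\frac{1}{\nu(y')}\right)^2\\
&+ \frac{\lambda\int_{y_1}^{t_0+y_1}\bar F(u)\,du}{\nu(y_1)^2}\\
\le{}& \lambda G(y)^{1-2/(2+\eta)} + \lambda\big(G(y)^{1-2/(2+\eta)}\vee G(y')^{1-2/(2+\eta)}\big) + \lambda G(y_1)^{1-2/(2+\eta)}\\
\le{}& C\big(G(y)^{\eta/(2+\eta)}\vee G(y')^{\eta/(2+\eta)}\big)
\end{aligned}\tag{67}$$

which implies that $\mathrm{diam}(S)$ is bounded.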

Now pick any $\epsilon>0$. Since $G(\cdot)$ is continuous we can define $G^{-1}(\cdot)$ to be the inverse of $G(\cdot)$. From (67) we have $d_1((t,y),(t',y'))<\epsilon$ for $y,y'>G^{-1}\big((\epsilon/C)^{(2+\eta)/\eta}\big)$ for some constant $C>0$.

Now also note that, by (66) and the mean value theorem applied to $1/\nu(\cdot)$,

$$\begin{aligned}
d_1((t,y),(t',y')) &\le \frac{C}{G(y)^{2/(2+\eta)}\wedge G(y')^{2/(2+\eta)}}\big(|t-t'|+|y-y'|\big) + \frac{C}{G(y)^{1+1/(2+\eta)}\wedge G(y')^{1+1/(2+\eta)}}\,|y-y'|^2\\
&\le \frac{C}{G(y)^{(3+\eta)/(2+\eta)}\wedge G(y')^{(3+\eta)/(2+\eta)}}\big(|t-t'|+|y-y'|\vee|y-y'|^2\big).
\end{aligned}$$


When at least one of $y$ and $y'$ is less than or equal to $G^{-1}\big((\epsilon/C)^{(2+\eta)/\eta}\big)$, we then get

$$d_1((t,y),(t',y'))\le\frac{C}{\epsilon^{(3+\eta)/\eta}}\big(|t-t'|+|y-y'|\vee|y-y'|^2\big).$$



Hence we can fill up the space S by 

number of $\epsilon$-balls. By (64) we get that $G(y)\le C/y^{p}$ for any $p>0$, and so $G^{-1}(\epsilon)\le C/\epsilon^{1/p}$. This gives



(3+r))/r/ £ p / I e 2+(3+ ?7 )/ ? j+p 
and hence

r diam(5) / rC 



< OO 



□ 
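As a numerical illustration of why a polynomially growing covering number is enough for the entropy integral to be finite, the following Python sketch evaluates $\int_0^1 H^{1/2}(\epsilon)\,d\epsilon$ under the simplifying assumption $N(\epsilon)=\epsilon^{-q}$ on $(0,1]$. The exponent `q`, the unit diameter and the function name `entropy_integral` are hypothetical choices made for illustration and are not quantities taken from the proof; in this special case the integral has the closed form $\sqrt{q\pi}/2$.

```python
import math


def entropy_integral(q, n=100_000):
    """Midpoint-rule approximation of int_0^1 sqrt(log N(eps)) d eps.

    Assumes (for illustration only) a polynomial covering number
    N(eps) = eps**(-q) on (0, 1], so that H(eps) = q * log(1 / eps).
    """
    if q <= 0:
        raise ValueError("q must be positive")
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        eps = (i + 0.5) * h  # midpoint of the i-th cell, always in (0, 1)
        total += math.sqrt(q * math.log(1.0 / eps))
    return total * h


if __name__ == "__main__":
    q = 4.0
    approx = entropy_integral(q)
    exact = math.sqrt(q * math.pi) / 2.0  # sqrt(q) * Gamma(3/2)
    print(f"numerical: {approx:.6f}, closed form: {exact:.6f}")
```

The integrand blows up only logarithmically as $\epsilon\to 0$, which is why the integral stays finite for every positive `q`.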



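Lemma 16. Borell-TIS inequality holds, i.e. for $x>E\sup_S\tilde R_1(t,y)$,

$$P\Big(\sup_S\tilde R_1(t,y)>x\Big)\le\exp\left\{-\frac{1}{2\sigma_1^2}\Big(x-E\sup_S\tilde R_1(t,y)\Big)^2\right\}$$

where

$$\sigma_1^2=\sup_S E\tilde R_1(t,y)^2.$$

Proof. Note that

$$E\tilde R_1(t,y)^2 = \frac{\lambda\int_0^t F(t+y-u)\bar F(t+y-u)\,du}{\nu(y)^2}\le\frac{\lambda\int_y^{t+y}\bar F(u)\,du}{\nu(y)^2}\le\lambda G(y)^{\eta/(2+\eta)}$$

and so

$$\sigma_1^2=\sup_S E\tilde R_1(t,y)^2\le C$$

for some constant $C$. By Lemma 15, $\tilde R_1(t,y)$ is a.s. bounded and Borell-TIS inequality holds. □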
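The variance bound used in this proof can be checked numerically in a simple special case. The Python sketch below assumes exponential service times with unit rate, so that $\bar F(u)=e^{-u}$, and takes $G(y)=\int_y^\infty\bar F(u)\,du=e^{-y}$ and $\nu(y)=G(y)^{1/(2+\eta)}$; these forms of $G$ and $\nu$, the parameter values, and the function names are assumptions made for illustration only.

```python
import math


def scaled_variance_exp(t, y, lam=1.0, eta=1.0):
    """lam * int_0^t F(t+y-u) * Fbar(t+y-u) du / nu(y)^2 for Exp(1) service.

    Substituting v = t + y - u turns the integral into
    int_y^{t+y} (e^{-v} - e^{-2v}) dv, which is available in closed form.
    """
    integral = (math.exp(-y) - math.exp(-(t + y))) \
        - 0.5 * (math.exp(-2.0 * y) - math.exp(-2.0 * (t + y)))
    nu_sq = math.exp(-2.0 * y / (2.0 + eta))  # assumed nu(y)^2 = G(y)^{2/(2+eta)}
    return lam * integral / nu_sq


def tail_bound_exp(y, lam=1.0, eta=1.0):
    """The bound lam * G(y)^{eta/(2+eta)} with the assumed G(y) = e^{-y}."""
    return lam * math.exp(-y * eta / (2.0 + eta))


if __name__ == "__main__":
    for t in (0.5, 2.0, 10.0):
        for y in (0.0, 1.0, 5.0, 20.0):
            v, b = scaled_variance_exp(t, y), tail_bound_exp(y)
            print(f"t={t:5.1f} y={y:5.1f} variance={v:.3e} bound={b:.3e}")
```

In this example the scaled variance stays below the bound for every $(t,y)$ and both decay in $y$, which is the behaviour that keeps $\sigma_1^2$ finite.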



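We now carry out the same scheme for $R_2(t,y)$. Let $\tilde R_2(t,y)=R_2(t,y)/\nu(y)$. It is straightforward to show that the d-metric of $\tilde R_2(t,y)$ is given by

$$\begin{aligned}
d_2((t,y),(t',y')) &= E\big(\tilde R_2(t,y)-\tilde R_2(t',y')\big)^2\\
&= \lambda c_a^2\int_0^{t_2}\left(\frac{\bar F(t+y-u)}{\nu(y)}-\frac{\bar F(t'+y'-u)}{\nu(y')}\right)^2du + \lambda c_a^2\int_{t_2}^{t_1}\left(\frac{\bar F(t_1+y_1-u)}{\nu(y_1)}\right)^2du
\end{aligned}\tag{68}$$

where again $t_1=t\vee t'$, $t_2=t\wedge t'$ and $y_1$, $y_2$ are the corresponding $y$ or $y'$.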



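Lemma 17. We can compactify the space $S$ with the d-metric defined in (68).

Proof. For $(t,y)$, $(t',y')$ such that $y,y'\ne\infty$, write

$$\begin{aligned}
d_2((t,y),(t',y')) = \lambda c_a^2\Bigg[&\frac{\int_0^{t_2}\bar F(t+y-u)^2\,du}{\nu(y)^2} + \frac{\int_0^{t_2}\bar F(t'+y'-u)^2\,du}{\nu(y')^2} - \frac{2\int_0^{t_2}\bar F(t+y-u)\bar F(t'+y'-u)\,du}{\nu(y)\nu(y')}\\
&+ \frac{\int_{t_2}^{t_1}\bar F(t_1+y_1-u)^2\,du}{\nu(y_1)^2}\Bigg]
\end{aligned}$$

and define, for $y'=\infty$, that

$$d_2((t,y),(t',y')) = \lambda c_a^2\,\frac{\int_0^{t}\bar F(t+y-u)^2\,du}{\nu(y)^2}$$

and $d_2((t,y),(t',y'))=0$ if both $y,y'=\infty$.

Then $d_2((t,y),(t',y'))$ is continuous at $y'=\infty$ since

$$\frac{\int_0^{t_2}\bar F(t'+y'-u)^2\,du}{\nu(y')^2}\le\frac{\int_{y'}^{t_0+y'}\bar F(u)\,du}{\nu(y')^2}\le G(y')^{\eta/(2+\eta)}\to 0$$

and

$$\begin{aligned}
\frac{\int_0^{t_2}\bar F(t+y-u)\bar F(t'+y'-u)\,du}{\nu(y)\nu(y')} &\le \frac{\sqrt{\int_0^{t_2}\bar F(t+y-u)^2\,du\int_0^{t_2}\bar F(t'+y'-u)^2\,du}}{\nu(y)\nu(y')}\\
&\le \sqrt{\frac{\int_y^{t_0+y}\bar F(u)\,du}{\nu(y)^2}}\sqrt{\frac{\int_{y'}^{t_0+y'}\bar F(u)\,du}{\nu(y')^2}}\\
&\le G(y)^{\eta/(2(2+\eta))}\,G(y')^{\eta/(2(2+\eta))}\to 0.
\end{aligned}$$

If $t'>t$, then

$$\frac{\int_{t}^{t'}\bar F(t'+y'-u)^2\,du}{\nu(y')^2}\le\frac{\int_{y'}^{t_0+y'}\bar F(u)\,du}{\nu(y')^2}\le G(y')^{\eta/(2+\eta)}\to 0.$$

Hence $d_2(\cdot,\cdot)$ is continuous at $y'=\infty$. The rest follows as in the proof of Lemma 14. □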



Lemma 18. $E\sup_S\tilde R_2(t,y)<\infty$. In particular, $\tilde R_2(t,y)$ is a.s. bounded over $S$.



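Proof. From (68) we have the estimate

$$\begin{aligned}
d_2((t,y),(t',y')) &\le 2\lambda c_a^2\left(\int_0^{t_2}\left(\frac{\bar F(t+y-u)}{\nu(y)}\right)^2du\vee\int_0^{t_2}\left(\frac{\bar F(t'+y'-u)}{\nu(y')}\right)^2du\right) + \lambda c_a^2\int_{t_2}^{t_1}\left(\frac{\bar F(t_1+y_1-u)}{\nu(y_1)}\right)^2du\\
&\le 2\lambda c_a^2\big(G(y)^{\eta/(2+\eta)}\vee G(y')^{\eta/(2+\eta)}\big) + \lambda c_a^2\,G(y_1)^{\eta/(2+\eta)}.
\end{aligned}\tag{69}$$

On the other hand, using multivariate Taylor series expansion
F(t + y- u) F(t' + y' - u)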



< sup 

t,y 

< 



v(y) v(y') 
f(t + y-u) 



\t — t'\ + sup 



1 F(t + y-u)F(y) f(y) 



2 + T] G(y) 1 + 1 /(2+r ) ) G ( y )l/(2+r,) 



\y - y'\ 



c 



G(y)( 3 +'?)/( 2 +'?) 



(\t-t'\ + \y-y'\ 
and hence



d 2 ((t,y),(t ; ,y')) < 



C 



{\t-t'\ + \y-y'\ 



(70) 



where $C$ are constants not necessarily the same every time they appear. With (69) and (70), the rest follows as in the proof of Lemma 15. □



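Lemma 19. Borell-TIS inequality holds, i.e. for $x>E\sup_S\tilde R_2(t,y)$,

$$P\Big(\sup_S\tilde R_2(t,y)>x\Big)\le\exp\left\{-\frac{1}{2\sigma_2^2}\Big(x-E\sup_S\tilde R_2(t,y)\Big)^2\right\}$$

where

$$\sigma_2^2=\sup_S E\tilde R_2(t,y)^2.$$

Proof. Note that

$$E\tilde R_2(t,y)^2 = \frac{\lambda c_a^2\int_0^t\bar F(t+y-u)^2\,du}{\nu(y)^2}\le\frac{\lambda c_a^2\int_y^{t+y}\bar F(u)\,du}{\nu(y)^2}\le\lambda c_a^2\,G(y)^{\eta/(2+\eta)}.$$

The rest follows as in the proof of Lemma 16. □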
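To see the shape of the Borell-TIS bound in a case where everything is explicit, the Python sketch below compares it with the exact tail of the supremum of standard Brownian motion on $[0,1]$. This process is a stand-in chosen for illustration and is not the process $\tilde R_i$ of the lemmas: for it the reflection principle gives $P(\sup B>x)=2P(B_1>x)$, $E\sup B=\sqrt{2/\pi}$, and the maximal variance is 1. The function names are hypothetical.

```python
import math


def borell_tis_bound(x, mean_sup, sigma_sq):
    """Upper bound exp(-(x - E sup)^2 / (2 sigma^2)), valid for x > E sup."""
    if x <= mean_sup:
        raise ValueError("the bound is stated only for x > E sup")
    return math.exp(-((x - mean_sup) ** 2) / (2.0 * sigma_sq))


def bm_sup_tail(x):
    """Exact P(sup_{0<=t<=1} B_t > x) for x >= 0 via the reflection principle."""
    # 2 * P(B_1 > x) = 2 * (1 - Phi(x)) = erfc(x / sqrt(2))
    return math.erfc(x / math.sqrt(2.0))


BM_MEAN_SUP = math.sqrt(2.0 / math.pi)  # E sup B = E|B_1|
BM_SIGMA_SQ = 1.0                       # sup_t Var(B_t) on [0, 1]

if __name__ == "__main__":
    for x in (1.0, 1.5, 2.0, 3.0, 4.0):
        exact = bm_sup_tail(x)
        bound = borell_tis_bound(x, BM_MEAN_SUP, BM_SIGMA_SQ)
        print(f"x={x:.1f}  exact tail={exact:.3e}  Borell-TIS bound={bound:.3e}")
```

The bound is loose near $E\sup B$ but has the Gaussian decay rate $e^{-x^2/(2\sigma^2)}$ in $x$, which is what makes the tail probability small once the threshold is large.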



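Lemma 10 is now an immediate corollary of Lemmas 16 and 19.

Proof of Lemma 10.

$$\begin{aligned}
&P\big(|R(t,y)|\le C^*\nu(y)\text{ for all }t\in[0,t_0],\ y\in[0,\infty)\big)\\
&\ge P\Big(\sup_S|\tilde R_1(t,y)|+\sup_S|\tilde R_2(t,y)|\le C^*\Big)\\
&\ge P\Big(\sup_S|\tilde R_1(t,y)|\le\frac{C^*}{2}\Big)\,P\Big(\sup_S|\tilde R_2(t,y)|\le\frac{C^*}{2}\Big)\\
&>0
\end{aligned}$$

when $C^*$ is large enough, by the independence of $\tilde R_1(\cdot,\cdot)$ and $\tilde R_2(\cdot,\cdot)$ in the second inequality. □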



With Lemma 10 we now prove Lemma 9.

Proof of Lemma 9. First consider (42). Take $C_1=3C^*$ where $C^*$ is the constant in Lemma 10. We have



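$$\begin{aligned}
&P\Big(Q^\infty(t,y)\in\Big(\lambda s\int_y^{t+y}\bar F(u)\,du\pm\sqrt{s}\,C_1\nu(y)\Big)\text{ for all }t\in[0,t_0],\ y\in[0,\infty)\ \Big|\ B(0)\Big)\\
&\ge P\Big(U_0\le x,\ 0\in\Big(\lambda s\int_y^{t+y}\bar F(u)\,du\pm\sqrt{s}\,C_1\nu(y)\Big)\text{ for }t\in[0,U_0],\ y\in[0,\infty),\\
&\qquad\ Q^\infty(t,y)\in\Big(\lambda s\int_y^{t+y}\bar F(u)\,du\pm\sqrt{s}\,C_1\nu(y)\Big)\text{ for all }t\in[U_0,t_0],\ y\in[0,\infty)\ \Big|\ B(0)\Big)
\end{aligned}\tag{71}$$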
Letting $x=1/(\lambda s)$, we will show that $0\in\big(\lambda s\int_y^{t+y}\bar F(u)\,du\pm\sqrt{s}\,C_1\nu(y)\big)$ for $t\in[0,U_0]$ and
y G [0, oo) in the expression is redundant. In fact, let m(s) = inf {y/sC*i>(y) < When 
y = m(s), As f^ +v F(u)du is less than 1 for large enough s, and when y > m(s) it decays faster than 

yfsC\v(y) < | (see Remark 1 in the paper for similar argument). Hence ^As fy +y F(u)du ± yfsC\v{y) 
contains when y > m(s). When y < m(s), the choice of x gives 



$$\lambda s\int_y^{t+y}\bar F(u)\,du\le\lambda s\,t\,\bar F(y)\le\lambda s x=1$$

for $t\in[0,U_0]$ and $U_0\le x$. Hence $\big(\lambda s\int_y^{t+y}\bar F(u)\,du\pm\sqrt{s}\,C_1\nu(y)\big)$ also contains 0 when $y\le m(s)$. In fact with the same choice of $x$, by a similar argument we have that $\big(\lambda s\int_y^{t+y}\bar F(u)\,du\pm\sqrt{s}\,C^*\nu(y)\big)$ contains only 0 for $t\in[0,U_0]$ and $y>m(s)$, and that $0\in\big(\lambda s\int_y^{t+U_0+y}\bar F(u)\,du\pm\sqrt{s}\,C_1\nu(y)\big)$ for $t\in[0,U_0]$ and $y>m(s)$. This will be useful later on in the proof.

The same choice of $x$, together with the fact that $\bar F(\cdot)$ is decreasing, also guarantees that

$$\lambda s\int_{t+y}^{t+U_0+y}\bar F(u)\,du\le 2C^*\sqrt{s}\,\nu(y). \tag{72}$$

In fact, when $y=m(s)$, $\lambda s\int_{t+y}^{t+U_0+y}\bar F(u)\,du$ is less than 1 when $s$ is large enough, and when $y>m(s)$ it decays faster than $2C^*\sqrt{s}\,\nu(y)$. Hence the inequality (72) holds when $y>m(s)$. When $y\le m(s)$ the fact that $U_0\le x$ leads to $\lambda s\int_{t+y}^{t+U_0+y}\bar F(u)\,du\le 1$, hence the conclusion. Again this will be useful later on.

Hence (71) is greater than or equal to

$$P(U_0\le x\mid B(0))\,P\Big(Q_0^\infty(t,y)\in\Big(\lambda s\int_y^{t+U_0+y}\bar F(u)\,du\pm\sqrt{s}\,C_1\nu(y)\Big)\text{ for all }t\in[0,t_0],\ y\in[0,\infty)\ \Big|\ U_0\le x\Big)$$

where $Q_0^\infty(t,y)$ is independent of $U_0$ and has the same distribution as $Q^\infty(t,y)$ with initial age 0 and no initial customers.






For any Uq < x, we have 



t+Uo+y 



P Qo°(*.y) e [ Xs / F(u)d« ± y/adu(y) for all t G [0,t Q ], y G [0, 



, oo 



t+U +y 



> PyQ™(t,y)e (As / F(u)du ± \fsC\v(y) ) for all t G [0, to], y G [0, m(s)) 



t+y 



Q™(t,y) G As/ ± VsC*i/(y) for all t G [0, to], y G [m(s), oo) 



(since the interval ( As / F{u)du ± yfsC*v(y) ) only contains while 



G As 



F(u)du ± yfsC\v(y) I when y > m(s) as discussed above) 



> P sup 

V i/e[0,m(s)) 



+ sup A\/s 



ye[0,m(«)) ./t+i/ 



F(u)du < Civ(y), 



t+y 



> P 



Q™(t,y)e (As / F{u)du±yfsC*v(y)J for all t G [0, t ], y G [m(s), oo) 

< O(y) for all t G [0,t ], y G [0,oo) ] 



Q?(t,y)-\s C y F(u)du 



(by ©) 

P(|i?(t, y)\ < C*v{y) for all t G [0, t ], y G [0, oo)) > 



by Lemma 10. The convergence follows from the Functional Central Limit Theorem (see Pang and Whitt (2009)) and the fact that the set $\{f : |f(t,y)|\le C^*\nu(y)\text{ for all }t\in[0,t_0],\ y\in[0,\infty)\}$ is a continuity set.

Lastly, since $U^0$ is light-tailed, by the argument following (53) in the proof of Proposition 1 we have

1 



infP \U < , 
b>0 V As 



B(0) =b) = MP[U°-b<l 
J b>0 \ A 



U° > b > 1 - e- c/x > 



for some constant $c>0$. Hence (42) holds. Inequality (43) is obvious, since one can isolate any point inside $S$ and the projection of the process on that point has a Gaussian distribution. For example, we can write



t+y _ 

P ( Q°°(t,y) <£ ( As / F{u)du±yfsCiv(y) ) for some f G [0,t ], y G [0,oo) 



P(0) 



> P(U <x)P\Q^(f,y*)> As 

> 

for any t* G [0, t ] and y* G [0, oo). 



F(u)du + J~sCiv(y*) 



□